SPrabhu's Blog

Thursday, April 01, 2021

Minikube on Fedora 33

I have been using minikube with the kvm2 driver to run a test Kubernetes environment. Here I document the steps I took to install minikube on my Fedora machine.

Minikube with kvm is intended to be run by a non-privileged user. In my case, I use a user who is a member of the libvirt group, which allows the user to create VMs.

Download and install the latest minikube package. This has to be run as root or a user with sudo access.

$ sudo dnf install https://storage.googleapis.com/minikube/releases/latest/minikube-latest.x86_64.rpm

Start up minikube. The command below downloads a kvm image which it then uses to create a virtual machine called minikube. You can see this VM running by calling 'virsh list' as the root user.

$ minikube start --driver=kvm2
😄 minikube v1.18.1 on Fedora 33
✨ Using the kvm2 driver based on user configuration
💾 Downloading driver docker-machine-driver-kvm2:
    > docker-machine-driver-kvm2....: 65 B / 65 B [----------] 100.00% ? p/s 0s
    > docker-machine-driver-kvm2: 11.39 MiB / 11.39 MiB 100.00% 28.09 MiB p/s
💿 Downloading VM boot image ...
    > minikube-v1.18.0.iso.sha256: 65 B / 65 B [-------------] 100.00% ? p/s 0s
    > minikube-v1.18.0.iso: 212.99 MiB / 212.99 MiB [] 100.00% 37.23 MiB p/s 6s
👍 Starting control plane node minikube in cluster minikube
💾 Downloading Kubernetes v1.20.2 preload ...
    > preloaded-images-k8s-v9-v1....: 491.22 MiB / 491.22 MiB 100.00% 39.17 Mi
🔥 Creating kvm2 VM (CPUs=2, Memory=2200MB, Disk=20000MB) ...
🐳 Preparing Kubernetes v1.20.2 on Docker 20.10.3 ...
    ▪ Generating certificates and keys ...
    ▪ Booting up control plane ...
    ▪ Configuring RBAC rules ...
🔎 Verifying Kubernetes components...
    ▪ Using image gcr.io/k8s-minikube/storage-provisioner:v4
🌟 Enabled addons: default-storageclass, storage-provisioner
💡 kubectl not found. If you need it, try: 'minikube kubectl -- get pods -A'
🏄 Done! kubectl is now configured to use "minikube" cluster and "default" namespace by default

Instead of passing the option --driver=kvm2, you can also set kvm2 to be the default driver.

$ minikube config set driver kvm2

To check the status of minikube:

$ minikube status
minikube
type: Control Plane
host: Running
kubelet: Running
apiserver: Running
kubeconfig: Configured
timeToStop: Nonexistent

Before we can use minikube, we need to install the kubectl utility to access the Kubernetes cluster.
As root or a user with sudo access, install the kubernetes-client package.

$ sudo dnf install kubernetes-client

kubectl uses the config file at ~/.kube/config.

To check that minikube is set up correctly, you can check the version of the client and the server with the command

$ kubectl version
Client Version: version.Info{Major:"1", Minor:"18", GitVersion:"v1.18.2", GitCommit:"52c56ce7a8272c798dbc29846288d7cd9fbae032", GitTreeState:"archive", BuildDate:"2020-07-28T00:00:00Z", GoVersion:"go1.15rc1", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"20", GitVersion:"v1.20.2", GitCommit:"faecb196815e248d3ecfb03c680a4507229c2a56", GitTreeState:"clean", BuildDate:"2021-01-13T13:20:00Z", GoVersion:"go1.15.5", Compiler:"gc", Platform:"linux/amd64"}
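
The output above shows a 1.18 client talking to a 1.20 server. As far as I know, the Kubernetes version skew policy only lists kubectl as supported within one minor version of the API server, so it is worth checking the gap. Below is a minimal Python sketch that extracts the minor versions from the two GitVersion fields and reports the difference. The function names are my own and the input strings are shortened copies of the output above.

```python
import re


def minor_version(version_line: str) -> int:
    """Return the minor number from a GitVersion:"vX.Y.Z" field."""
    match = re.search(r'GitVersion:"v(\d+)\.(\d+)\.(\d+)', version_line)
    if match is None:
        raise ValueError("no GitVersion field found")
    return int(match.group(2))


def minor_skew(client_line: str, server_line: str) -> int:
    """Absolute difference between client and server minor versions."""
    return abs(minor_version(client_line) - minor_version(server_line))


# Shortened copies of the two lines printed by 'kubectl version' above.
client = 'Client Version: version.Info{Major:"1", Minor:"18", GitVersion:"v1.18.2"}'
server = 'Server Version: version.Info{Major:"1", Minor:"20", GitVersion:"v1.20.2"}'

print(minor_skew(client, server))  # prints 2
```

A skew larger than one suggests using 'minikube kubectl --', which fetches a kubectl matching the cluster version.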

Thursday, June 06, 2019

Howto: CIFS kerberos mount

Steps

1) I use a Windows server with AD configured. A Samba server with Kerberos configured can be used too.

2) Setup /etc/krb5.conf. My test machines use the following.

[logging]
default = FILE:/var/log/krb5libs.log
kdc = FILE:/var/log/krb5kdc.log
admin_server = FILE:/var/log/kadmind.log

[libdefaults]
default_realm = ENG1.GSSLAB.FAB.REDHAT.COM
dns_lookup_realm = true
dns_lookup_kdc = true
allow_weak_crypto = 1

[realms]
ENG1.GSSLAB.FAB.REDHAT.COM = {
kdc = vm140-52.eng1.gsslab.fab.redhat.com:88
}

[domain_realm]
.eng1.gsslab.fab.redhat.com = ENG1.GSSLAB.FAB.REDHAT.COM
eng1.gsslab.fab.redhat.com = ENG1.GSSLAB.FAB.REDHAT.COM
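
The two [domain_realm] entries behave differently: the entry with a leading dot covers every host below that domain, while the entry without the dot matches only that exact name. The Python sketch below is a simplified model of that lookup, assuming an exact match is tried first and parent domains afterwards. The function name is hypothetical and the real library has further fallbacks that are not modelled here.

```python
def realm_for_host(hostname, domain_realm):
    """Simplified [domain_realm] lookup: exact host first, then parent domains."""
    host = hostname.lower()
    if host in domain_realm:
        return domain_realm[host]
    labels = host.split(".")
    # Try ".parent.domain" entries from the most specific to the least specific.
    for i in range(1, len(labels)):
        suffix = "." + ".".join(labels[i:])
        if suffix in domain_realm:
            return domain_realm[suffix]
    return None  # the real library falls back to other mechanisms here


# The mappings from the krb5.conf above.
DOMAIN_REALM = {
    ".eng1.gsslab.fab.redhat.com": "ENG1.GSSLAB.FAB.REDHAT.COM",
    "eng1.gsslab.fab.redhat.com": "ENG1.GSSLAB.FAB.REDHAT.COM",
}

print(realm_for_host("vm140-52.eng1.gsslab.fab.redhat.com", DOMAIN_REALM))
```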

3) Edit /etc/request-key.conf and add the following 2 lines (see man cifs.upcall).

create cifs.spnego * * /usr/sbin/cifs.upcall %k
create dns_resolver * * /usr/sbin/cifs.upcall %k

4) As the root user, run kinit with an AD user's credentials.

# kinit wintest2
Password for wintest2@ENG1.GSSLAB.FAB.REDHAT.COM:

5) Now mount using the multiuser option to allow multiple users who have authenticated with their own credentials to log in.

# mount -t cifs -o sec=krb5,sign,multiuser vm140-52.eng1.gsslab.fab.redhat.com:/exports /mnt

The multiuser mount option allows a single cifs mount to be used by multiple users, each with their own credentials. An example is a cifs mount which contains the users' home directories. Instead of individually mounting each user's home directory as they log in, the root user on the client machine can mount the exported homes share under /home. As users log in, they access their cifs mounted home directory using their own credentials. A new session is set up the first time a new user accesses the share, and this session is subsequently used for that user when accessing the share.
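
The per-user session behaviour can be pictured with a small Python model. This is only a toy illustration of the idea of one mount with one session per user, not how the kernel client is implemented. The class and attribute names are hypothetical.

```python
class MultiuserMount:
    """Toy model: one mount, one lazily created session per local user."""

    def __init__(self, share):
        self.share = share
        self.sessions = {}   # uid -> session label
        self.setups = 0      # number of session setups performed

    def session_for(self, uid):
        # A session is set up only on the first access by a given user.
        if uid not in self.sessions:
            self.setups += 1
            self.sessions[uid] = f"session {self.setups} (uid {uid})"
        return self.sessions[uid]


mount = MultiuserMount("//vm140-52.eng1.gsslab.fab.redhat.com/exports")
first = mount.session_for(1000)
again = mount.session_for(1000)   # reuses the session created above
other = mount.session_for(1001)   # a different user gets a new session
print(mount.setups)  # prints 2
```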

Friday, May 17, 2019

Samba multichannel - Connecting to an existing channel

We investigate how a new channel is added to an existing channel on a multichannel connection.

We first need to familiarise ourselves with how a new incoming connection is handled.
http://sprabhu.blogspot.com/2018/03/samba-handling-new-connections.html

To summarise how a new connection is created:

a) From the main thread, we call main()->open_sockets_smbd()->smbd_open_one_socket()->tevent_add_fd() to set a tevent handler to call smbd_accept_connection() whenever a new connection is opened with the samba server.

b) For a new connection coming in, the server calls smbd_accept_connection() which forks a child process and calls smbd_process() in the child.

c) Within smbd_process() a new client(struct smbXsrv_client) and a new xconn(struct smbXsrv_connection) are created. The xconn itself is added to the connection list on the new client which was created.

d) Within smbd_add_connection(), we also add a tevent fd handler smbd_server_connection_handler() to handle incoming data on the new socket created for the client.

We also set up the infrastructure necessary to pass the socket file descriptor. When a new client is created within smbd_process()->smbXsrv_client_create(), we set up the messaging infrastructure to handle incoming message requests for the message id MSG_SMBXSRV_CONNECTION_PASS.

NTSTATUS smbXsrv_client_create(TALLOC_CTX *mem_ctx,
                               struct tevent_context *ev_ctx,
                               struct messaging_context *msg_ctx,
                               NTTIME now,
                               struct smbXsrv_client **_client)
{
..
        global->server_id = messaging_server_id(client->msg_ctx);
..
        subreq = messaging_filtered_read_send(client,
                                        client->raw_ev_ctx,
                                        client->msg_ctx,
                                        smbXsrv_client_connection_pass_filter,
                                        client);
..
        tevent_req_set_callback(subreq, smbXsrv_client_connection_pass_loop, client);
..
}

i.e. for incoming requests with message id MSG_SMBXSRV_CONNECTION_PASS, we call the handler smbXsrv_client_connection_pass_loop().

At this point, the socket is established. When data is first sent onto the socket by the client, it is handled by the tevent handler smbd_server_connection_handler() followed by smbd_server_connection_read_handler() which subsequently calls process_smb() to process the incoming request.

static void smbd_server_connection_handler(struct tevent_context *ev,
                                           struct tevent_fd *fde,
                                           uint16_t flags,
                                           void *private_data)
{
..
        //xconn is passed as argument to the tevent callback. We read this argument
        struct smbXsrv_connection *xconn =
                talloc_get_type_abort(private_data,
                struct smbXsrv_connection);
..
        if (flags & TEVENT_FD_READ) {
                smbd_server_connection_read_handler(xconn, xconn->transport.sock);
                return;
        }
}

//Used to handle all incoming read calls.
static void smbd_server_connection_read_handler(
        struct smbXsrv_connection *xconn, int fd)
{
..
process:
        process_smb(xconn, inbuf, inbuf_len, unread_bytes,
                    seqnum, encrypted, NULL);
}

This is where we start differentiating between SMB1 and later protocol versions. For SMB2 and later, the first request is passed on to smbd_smb2_process_negprot().

void smbd_smb2_process_negprot(struct smbXsrv_connection *xconn,
                               uint64_t expected_seq_low,
                               const uint8_t *inpdu, size_t size)
{
..
        struct smbd_smb2_request *req = NULL;
..
        //Documented below
        status = smbd_smb2_request_create(xconn, inpdu, size, &req);

..
        status = smbd_smb2_request_dispatch(req);
..
}

static NTSTATUS smbd_smb2_request_create(struct smbXsrv_connection *xconn,
                                         const uint8_t *_inpdu, size_t size,
                                         struct smbd_smb2_request **_req)
{
        struct smbd_server_connection *sconn = xconn->client->sconn;
..
        struct smbd_smb2_request *req;
..
        req = smbd_smb2_request_allocate(xconn);
..
        req->sconn = sconn;
        req->xconn = xconn;
..
        status = smbd_smb2_inbuf_parse_compound(xconn,
                                                now,
                                                inpdu,
                                                size,
                                                req, &req->in.vector,
                                                &req->in.vector_count);
..
        *_req = req;
        return NT_STATUS_OK;
}

At this point the buffer containing the incoming request is stored in the smbd_smb2_request *req.

We call smbd_smb2_request_dispatch() to handle the data.

NTSTATUS smbd_smb2_request_dispatch(struct smbd_smb2_request *req)
{
        struct smbXsrv_connection *xconn = req->xconn;
..
        /*
         * Check if the client provided a valid session id.
         *
         * As some command don't require a valid session id
         * we defer the check of the session_status
         */
        session_status = smbd_smb2_request_check_session(req);
..
        flags = IVAL(inhdr, SMB2_HDR_FLAGS);
        opcode = SVAL(inhdr, SMB2_HDR_OPCODE);
        mid = BVAL(inhdr, SMB2_HDR_MESSAGE_ID);
..
        switch (opcode) {
..
        case SMB2_OP_NEGPROT:
                SMBPROFILE_IOBYTES_ASYNC_START(smb2_negprot, profile_p,
                                               req->profile, _INBYTES(req));
                return_value = smbd_smb2_request_process_negprot(req);
                break;
..
}

Since this is the first call sent by the client, it is a negotiate request which is handled by smbd_smb2_request_process_negprot().

NTSTATUS smbd_smb2_request_process_negprot(struct smbd_smb2_request *req)
{
..
        //Obtain the GUID passed in by the client.
        in_guid_blob = data_blob_const(inbody + 0x0C, 16);
..
        status = GUID_from_ndr_blob(&in_guid_blob, &in_guid);
..
        xconn->smb2.client.guid = in_guid;
..
        if (xconn->protocol < PROTOCOL_SMB2_10) {
                /*
                 * SMB2_02 doesn't support client guids
                 */
                return smbd_smb2_request_done(req, outbody, &outdyn);
        }
        //Only SMB 2.1 and later protocols here.

        if (!xconn->client->server_multi_channel_enabled) {
                /*
                 * Only deal with the client guid database
                 * if multi-channel is enabled.
                 */
                return smbd_smb2_request_done(req, outbody, &outdyn);
        }
        //Only reached when multichannel is enabled on the server.
..
        status = smb2srv_client_lookup_global(xconn->client,
                                              xconn->smb2.client.guid,
                                              req, &global0);
..
        if (NT_STATUS_EQUAL(status, NT_STATUS_OBJECTID_NOT_FOUND)) {
        //If no existing connection is found, set it up.
                xconn->client->global->client_guid =
                        xconn->smb2.client.guid;
                status = smbXsrv_client_update(xconn->client);
..
                xconn->smb2.client.guid_verified = true;
        } else if (NT_STATUS_IS_OK(status)) {
        //We have found an existing client with the same guid.
        //So pass the connection to the original smbd process.
                status = smb2srv_client_connection_pass(req,
                                                        global0);

                if (!NT_STATUS_IS_OK(status)) {
                        return smbd_smb2_request_error(req, status);
                }
        //and terminate this connection.
                smbd_server_connection_terminate(xconn,
                                                 "passed connection");
                return NT_STATUS_OBJECTID_EXISTS;
        } else {
                return smbd_smb2_request_error(req, status);
        }

}
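
The branches in the function above can be summarised with a small Python model. It is a sketch of the decision only: the enum values, the dictionary standing in for the client GUID database and the returned strings are all hypothetical and do not mirror Samba's data structures.

```python
from enum import IntEnum


class Protocol(IntEnum):
    # Only the ordering matters here; the numeric values are made up.
    SMB2_02 = 1
    SMB2_10 = 2
    SMB3_00 = 3


def negprot_action(protocol, multichannel_enabled, client_guid, guid_db, my_server_id):
    """Return what the smbd process does with a negotiate request."""
    if protocol < Protocol.SMB2_10:
        return "reply"                      # SMB 2.0.2 has no client GUID
    if not multichannel_enabled:
        return "reply"                      # GUID database is not consulted
    owner = guid_db.get(client_guid)
    if owner is None:
        guid_db[client_guid] = my_server_id  # first connection from this client
        return "register and reply"
    return f"pass connection to {owner}"     # existing client: hand over the socket


guid_db = {}
print(negprot_action(Protocol.SMB3_00, True, "guid-A", guid_db, "smbd-1"))
print(negprot_action(Protocol.SMB3_00, True, "guid-A", guid_db, "smbd-2"))
```

The second call models the additional channel: the GUID is already registered by smbd-1, so the new process hands the connection over instead of serving it.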

NTSTATUS smb2srv_client_connection_pass(struct smbd_smb2_request *smb2req,
                                        struct smbXsrv_client_global0 *global)
{
..
        pass_info0.initial_connect_time = global->initial_connect_time;
        pass_info0.client_guid = global->client_guid;
..
        pass_info0.negotiate_request.length = reqlen;
        pass_info0.negotiate_request.data = talloc_array(talloc_tos(), uint8_t,
                                                         reqlen);
..
        iov_buf(smb2req->in.vector, smb2req->in.vector_count,
                pass_info0.negotiate_request.data,
                pass_info0.negotiate_request.length);

        ZERO_STRUCT(pass_blob);
        pass_blob.version = smbXsrv_version_global_current();
        pass_blob.info.info0 = &pass_info0;
..
        ndr_err = ndr_push_struct_blob(&blob, talloc_tos(), &pass_blob,
                        (ndr_push_flags_fn_t)ndr_push_smbXsrv_connection_passB);
..
        //Add the created data blobs to an iov
        iov.iov_base = blob.data;
        iov.iov_len = blob.length;

        //and send the iovs to the original smbd process using
        //message id MSG_SMBXSRV_CONNECTION_PASS.
        status = messaging_send_iov(smb2req->xconn->client->msg_ctx,
                                    global->server_id,
                                    MSG_SMBXSRV_CONNECTION_PASS,
                                    &iov, 1,
                                    &smb2req->xconn->transport.sock, 1);
..
}

At this point, the smbd process for the new connection sends the original smbd process a message with the data required to transfer the channel to the original process.
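
The key point is that the message carries an open socket descriptor along with the negotiate request data. The Python sketch below shows the general Unix mechanism of passing a descriptor over a Unix domain socket, which I am assuming is what happens underneath. It uses socket.send_fds() and socket.recv_fds(), which need Python 3.9 or later on a Unix system. Both "processes" are simulated inside one process and all names are my own.

```python
import socket


def pass_fd_demo():
    """Pass one end of a connection between two endpoints, then use the passed copy."""
    # Stand-ins for the messaging endpoints of the new and the original smbd.
    new_proc, original_proc = socket.socketpair()
    # Stand-in for the client's connection; server_end is what gets passed.
    client_end, server_end = socket.socketpair()
    try:
        # The new process sends the request data together with the descriptor.
        socket.send_fds(new_proc, [b"negotiate-request"], [server_end.fileno()])
        payload, fds, _flags, _addr = socket.recv_fds(original_proc, 1024, 1)
        # The new process drops its copy, like terminating the passed connection.
        server_end.close()
        # The original process keeps serving the client on the received descriptor.
        passed = socket.socket(fileno=fds[0])
        client_end.sendall(b"hello")
        data = passed.recv(5)
        passed.close()
        return payload, data
    finally:
        client_end.close()
        new_proc.close()
        original_proc.close()


print(pass_fd_demo())
```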

We call the handler for the message and process the incoming data.

static void smbXsrv_client_connection_pass_loop(struct tevent_req *subreq)
{
..
        //We read data from the iovs passed in the message.
..
        //We perform some sanity tests.
..
        SMB_ASSERT(rec->num_fds == 1);
        sock_fd = rec->fds[0];
..
        //We add the new connection to the original smbd process client.
        status = smbd_add_connection(client, sock_fd, &xconn);
..
        //We process the negprot in the original smbd process.
        xconn->smb2.client.guid_verified = true;
        smbd_smb2_process_negprot(xconn, seq_low,
                                  pass_info0->negotiate_request.data,
                                  pass_info0->negotiate_request.length);
..
}

At this point, we have
a) added a new connection xconn to the existing client from the original connection,
b) set the data handler for the socket file descriptor to smbd_server_connection_handler() so that any incoming data is handled by the smbd process handling the original connection, and
c) terminated the new smbd process created for the new channel, so that all new incoming requests are handled by the handler specified in b).

Friday, February 15, 2019

ceph-ansible: Installing Ceph

My previous post deals with using Vagrant to install CentOS based test systems to install Ceph on. There, I created 7 virtual machines which include
a) a machine to run ansible commands on,
b) three monitors(mons) and
c) three Object Storage Devices (OSDs), each of which has an additional 5 GB block device available to the OSD daemon.

Since I use Fedora 29 as my host system, I do not need the ansible vm and disabled it by commenting out the line from the array in the Vagrantfile I posted in the previous post.

The steps detailed below can be run either directly on the host machine or if needed, on the ansible vm with suitable modifications.

Install Ansible on your vm or your host machine.

$ sudo dnf install -y ansible
..
$ rpm -q ansible
ansible-2.7.5-1.fc29.noarch

Obtain the latest ceph-ansible using git

$ git clone https://github.com/ceph/ceph-ansible.git

I wanted to install the "Luminous" version of Ceph. According to the ceph-ansible documentation at
http://docs.ceph.com/ceph-ansible/master/#releases
I need the stable-3.2 branch of ceph-ansible. This will only work with Ansible version 2.6.

From the commands above, we have the 2.7.5-1 version of ansible. We need to downgrade our ansible package.

$ sudo dnf downgrade ansible
..
$ rpm -q ansible
ansible-2.6.5-1.fc29.noarch

We now have to go into the ceph-ansible directory and change to the stable-3.2 branch. I then like to create a branch of my own with the configuration files I need.

$ cd ceph-ansible
$ git checkout stable-3.2
..
$ git checkout -b ceph-test
Switched to a new branch 'ceph-test'

To get ceph-ansible to work, I've also had to separately install python3-pyyaml and python3-notario.

$ sudo dnf install python3-pyyaml
$ sudo dnf install python3-notario

We are now ready to configure ceph-ansible to start the installation.

First create the hosts file containing the machines you would like to use.

[mons]
mon1
mon2
mon3

[osds]
osd1
osd2
osd3
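
Before running the playbook it can be handy to confirm that the hosts file contains the groups you expect. The sketch below reads the inventory shown above with Python's configparser. This is only a quick sanity check of my own and not Ansible's inventory parser, which supports syntax (host variables, ranges, child groups) that this does not handle.

```python
import configparser

INVENTORY = """\
[mons]
mon1
mon2
mon3

[osds]
osd1
osd2
osd3
"""


def inventory_groups(text):
    """Return {group: [hosts]} for a simple INI inventory with bare host names."""
    parser = configparser.ConfigParser(allow_no_value=True)
    parser.read_string(text)
    return {section: list(parser[section]) for section in parser.sections()}


groups = inventory_groups(INVENTORY)
print(groups)
```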

Then create group_vars/all.yml with the content

ceph_origin: 'repository'
ceph_repository: community
ceph_stable_release: luminous
public_network: "192.168.145.0/24"
monitor_interface: eth1
journal_size: 1024
devices:
  - /dev/vdb
osd_scenario: lvm

I use the community repository at http://download.ceph.com to download the Luminous release.
More information at http://docs.ceph.com/ceph-ansible/master/installation/methods.html

ceph_origin: 'repository'
ceph_repository: community
ceph_stable_release: luminous

I create the test machines with private addresses in the 192.168.145.0/24 subnet. These are configured on eth1 on my KVM based test machines.

public_network: "192.168.145.0/24"
monitor_interface: eth1
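
Since public_network has to cover the addresses the daemons listen on, a quick check that every test machine falls inside the subnet can save a failed run. Here is a minimal Python sketch using the standard ipaddress module. The function name is hypothetical and the addresses are the ones assigned to the machines in the Vagrantfile from the previous post.

```python
import ipaddress


def hosts_outside(network, addresses):
    """Return the names of hosts whose address is not inside the network."""
    net = ipaddress.ip_network(network)
    return [name for name, addr in addresses.items()
            if ipaddress.ip_address(addr) not in net]


ADDRESSES = {
    "mon1": "192.168.145.41",
    "mon2": "192.168.145.42",
    "mon3": "192.168.145.43",
    "osd1": "192.168.145.51",
    "osd2": "192.168.145.52",
    "osd3": "192.168.145.53",
}

print(hosts_outside("192.168.145.0/24", ADDRESSES))  # prints []
```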

The block devices for the OSDs are created as /dev/vdb on these KVM based test machines.

devices:
  - /dev/vdb

The following line was needed for this version of ceph-ansible and describes how ceph-volume creates the devices on the OSD. More information is available at http://docs.ceph.com/ceph-ansible/master/osds/scenarios.html

osd_scenario: lvm

You can look over group_vars/all.yml.sample to see the various configuration options available to you.

Copy over site.yml.

$ cp site.yml.sample site.yml

Make sure that the test machines mon1, mon2, mon3, osd1, osd2 and osd3 have been started using "vagrant up". You can now start deploying Ceph on the machines with the command

$ ansible-playbook -i hosts -u root site.yml

This takes several minutes at the end of which you have a ceph cluster installed on your test virtual machines.

At this point, you can optionally commit the changes to the git repo so that you can continue to experiment with various settings and then roll back to the working copy if needed.

Next, ssh into mon1 as root and run the "ceph health" and "ceph status" commands.

[root@mon1 ~]# ceph health
HEALTH_WARN no active mgr

[root@mon1 ~]# ceph status
cluster:
    id:     9de96055-aba6-4837-ac8e-12156bb7335c
    health: HEALTH_WARN
            no active mgr

services:
    mon: 3 daemons, quorum mon1,mon2,mon3
    mgr: no daemons active
    osd: 3 osds: 3 up, 3 in

data:
    pools:   0 pools, 0 pgs
    objects: 0 objects, 0B
    usage:   0B used, 0B / 0B avail
    pgs:

The warning message "no active mgr" is seen because Ceph, since Luminous, requires the ceph-mgr daemon to run alongside its monitor daemons. The ceph-mgr daemon provides additional monitoring and an interface through which external monitoring tools can monitor the system.

We will cover the installation of the ceph-mgr daemon in the next post.

Using Vagrant to create KVM based vms to test Ceph

Ceph, the software defined storage solution, has been growing in popularity, especially as a cloud storage platform. To learn more about this product, I purchased the book 'Mastering Ceph: Redefine your storage system' by Nick Fisk.

The first hurdle when using the book was that the examples provided rely on Vagrant with VirtualBox to create test machines running Ubuntu. I use a Fedora 29 machine and would like to use CentOS on KVM instead for my setup.

To get the test machines setup for my environment, I've had to deviate from the instructions given in the book. This post is based on my notes. These notes should also benefit those who are just looking to use Vagrant with KVM.

First install Vagrant and the libvirt plugin for vagrant.

$ sudo dnf install -y vagrant vagrant-libvirt

I use the directory ~/vagrant/ceph as the location for the Vagrantfile for my test machines.

My Vagrantfile is as follows:

storage_pool_name = "vagrant-pool"

nodes = [
  { :hostname => 'ansible', :ip => '192.168.145.40', :box => 'centos/7' },
  { :hostname => 'mon1', :ip => '192.168.145.41', :box => 'centos/7' },
  { :hostname => 'mon2', :ip => '192.168.145.42', :box => 'centos/7' },
  { :hostname => 'mon3', :ip => '192.168.145.43', :box => 'centos/7' },
  { :hostname => 'osd1', :ip => '192.168.145.51', :box => 'centos/7', :ram =>1024, :osd => 'yes' },
  { :hostname => 'osd2', :ip => '192.168.145.52', :box => 'centos/7', :ram =>1024, :osd => 'yes' },
  { :hostname => 'osd3', :ip => '192.168.145.53', :box => 'centos/7', :ram =>1024, :osd => 'yes' },
]

Vagrant.configure("2") do |config|
  config.ssh.forward_agent = true
  config.ssh.insert_key = false
  config.ssh.private_key_path = ["~/.vagrant.d/insecure_private_key","~/.ssh/id_rsa"]
  config.vm.provision :shell, privileged: false do |s|
    ssh_pub_key = File.readlines("#{Dir.home}/.ssh/id_rsa.pub").first.strip
    s.inline = <<-SHELL
      echo #{ssh_pub_key} >> /home/$USER/.ssh/authorized_keys
      sudo mkdir -m 0700 /root/.ssh/
      sudo bash -c "echo #{ssh_pub_key} >> /root/.ssh/authorized_keys"
    SHELL
  end

  nodes.each do |node|
    config.vm.define node[:hostname] do |nodeconfig|
      nodeconfig.vm.box = node[:box]
      nodeconfig.vm.hostname = node[:hostname]
      nodeconfig.vm.network :private_network, ip: node[:ip]

      memory = node[:ram] ? node[:ram] : 512;
      nodeconfig.vm.provider :libvirt do |lv|
        lv.storage_pool_name = storage_pool_name
        lv.driver = "kvm"
        lv.uri = "qemu:///system"
        lv.memory = memory
        lv.graphics_type = "none"
        if node[:osd] == "yes"
          lv.storage :file, :size => '5G'
        end
      end
    end
  end
end

Going through this step-by-step

    storage_pool_name = "vagrant-pool"

I use the variable storage_pool_name to hold the name of the libvirt storage pool. I created this pool on my laptop as a 'Filesystem Directory' pool named "vagrant-pool".

nodes = [
  { :hostname => 'ansible', :ip => '192.168.145.40', :box => 'centos/7' },
  { :hostname => 'mon1', :ip => '192.168.145.41', :box => 'centos/7' },
  { :hostname => 'mon2', :ip => '192.168.145.42', :box => 'centos/7' },
  { :hostname => 'mon3', :ip => '192.168.145.43', :box => 'centos/7' },
  { :hostname => 'osd1', :ip => '192.168.145.51', :box => 'centos/7', :ram =>1024, :osd => 'yes' },
  { :hostname => 'osd2', :ip => '192.168.145.52', :box => 'centos/7', :ram =>1024, :osd => 'yes' },
  { :hostname => 'osd3', :ip => '192.168.145.53', :box => 'centos/7', :ram =>1024, :osd => 'yes' },
]

This is an array of hashes (dictionaries) containing details of the machines to be set up.

    Vagrant.configure("2") do |config|

    ..

    end

The main configuration block used by Vagrant.

I then initialise common settings for all the test machines. I've commented inline.

      #Use my personal ssh key across the test machines without having to copy my private key to the test machines.
      config.ssh.forward_agent = true

      #Do not regenerate a new key for each test machine.
      config.ssh.insert_key = false

      # The private keys I use. This is used for "vagrant ssh".
      config.ssh.private_key_path = ["~/.vagrant.d/insecure_private_key","~/.ssh/id_rsa"]

      #This block reads my public key and appends it to .ssh/authorized_keys for the user and root accounts.
      config.vm.provision :shell, privileged: false do |s|
        ssh_pub_key = File.readlines("#{Dir.home}/.ssh/id_rsa.pub").first.strip
        s.inline = <<-SHELL
          echo #{ssh_pub_key} >> /home/$USER/.ssh/authorized_keys
          sudo mkdir -m 0700 /root/.ssh/
          sudo bash -c "echo #{ssh_pub_key} >> /root/.ssh/authorized_keys"
        SHELL
      end

We then iterate through the list of nodes in the nodes array setting up a test box for each node with the described features in the block.

  #For each node in the nodes array.
  nodes.each do |node|
    #define a new test box with name :hostname.
    config.vm.define node[:hostname] do |nodeconfig|
      #This is set to centos/7 for all the nodes.
      nodeconfig.vm.box = node[:box]
      #This is the name to be set.
      nodeconfig.vm.hostname = node[:hostname]
      #Set an ip address given in the private network.
      nodeconfig.vm.network :private_network, ip: node[:ip]

      #If a value has been provided for ram, we use that or default to 512M
      memory = node[:ram] ? node[:ram] : 512;
      #We configure testmachine in libvirt
      nodeconfig.vm.provider :libvirt do |lv|
        #This is set to "vagrant-pool" we created earlier
        lv.storage_pool_name = storage_pool_name
        #Use KVM.
        lv.driver = "kvm"
        lv.uri = "qemu:///system"
        lv.memory = memory
        lv.graphics_type = "none"

        #Create a new storage of 5G if it is an OSD.
        if node[:osd] == "yes"
          lv.storage :file, :size => '5G'
        end
      end
    end
  end
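
To make the defaults in this loop explicit, here is the same per-node logic as a short Python sketch. It only mirrors the decisions made in the Ruby block (512 MB unless :ram is given, an extra 5G disk only for OSD nodes). The function name and the returned dictionary layout are my own.

```python
def effective_settings(node):
    """Mirror the Vagrantfile logic: default memory and optional OSD disk."""
    return {
        "hostname": node["hostname"],
        "ip": node["ip"],
        "memory": node.get("ram", 512),                       # default to 512M
        "extra_disk": "5G" if node.get("osd") == "yes" else None,
    }


# Two entries copied from the nodes array.
mon1 = {"hostname": "mon1", "ip": "192.168.145.41", "box": "centos/7"}
osd1 = {"hostname": "osd1", "ip": "192.168.145.51", "box": "centos/7",
        "ram": 1024, "osd": "yes"}

print(effective_settings(mon1))
print(effective_settings(osd1))
```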

To create the test machines, we simply call

$ vagrant up

We can also call up individual machines by providing a list of names.

$ vagrant up mon1 osd1

On first run, vagrant will download an image of CentOS 7. Subsequent runs will be faster.

We can suspend and resume using the commands

$ vagrant suspend
$ vagrant resume

This ensures that the virtual machines are still available when you get back to them later.

When we are done and no longer need the machines, we can use the following command to stop and delete them.

$ vagrant destroy

As with "vagrant up", we can provide machine names.

To complete the setup, I add the following to my /etc/hosts file.

#Vagrant hosts
192.168.145.40  ansible
192.168.145.41  mon1
192.168.145.42  mon2
192.168.145.43  mon3
192.168.145.51  osd1
192.168.145.52  osd2
192.168.145.53  osd3
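
These entries have to stay in step with the addresses in the Vagrantfile, so they can also be generated rather than typed by hand. A minimal Python sketch, assuming the same host names and addresses as the nodes array. The function name is hypothetical.

```python
NODES = [
    ("ansible", "192.168.145.40"),
    ("mon1", "192.168.145.41"),
    ("mon2", "192.168.145.42"),
    ("mon3", "192.168.145.43"),
    ("osd1", "192.168.145.51"),
    ("osd2", "192.168.145.52"),
    ("osd3", "192.168.145.53"),
]


def hosts_lines(nodes):
    """Format (hostname, ip) pairs as /etc/hosts lines."""
    return [f"{ip}  {name}" for name, ip in nodes]


print("#Vagrant hosts")
print("\n".join(hosts_lines(NODES)))
```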

Since I will be recreating these test machines several times, I do not want to keep modifying ~/.ssh/known_hosts because of the strict host key checking I use on my main setup.
I add the following lines to ~/.ssh/config.

#Vagrant hosts
HOST mon?
        StrictHostKeyChecking no
HOST osd?
        StrictHostKeyChecking no
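
The '?' in these Host patterns matches exactly one character, so mon1 to mon3 and osd1 to osd3 are covered but a longer name such as mon10 would not be. The Python sketch below illustrates this with fnmatch, whose '?' and '*' wildcards behave the same way for these simple patterns. fnmatch also understands bracket expressions, so it is only an approximation of ssh's matching. The function name is my own.

```python
from fnmatch import fnmatchcase

PATTERNS = ["mon?", "osd?"]


def relaxed_host(hostname, patterns=PATTERNS):
    """True if the host name matches one of the Host patterns above."""
    return any(fnmatchcase(hostname, pattern) for pattern in patterns)


for host in ["mon1", "osd3", "mon10", "ansible"]:
    print(host, relaxed_host(host))
```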

You can now test by sshing into the test machines.
$ ssh root@mon1

Monday, October 08, 2018

Samba: Triggering oplock breaks

I have been investigating the Samba code as part of my task to implement oplock break retry code for a multichannel setup. This is a small extract from my notes which looks into how the oplock break is triggered on the samba server.

Registering the oplock break handler

The smbd server forks a new smbd process to handle each new incoming connection from an SMB client.

As part of the initialisation of the new smbd process, an oplock break handler is initialised.

void smbd_process(struct tevent_context *ev_ctx,
                  struct messaging_context *msg_ctx,
                  int sock_fd,
                  bool interactive)
{
..
        status = smbXsrv_client_create(ev_ctx, ev_ctx, msg_ctx, now, &client);
..
        status = smbd_add_connection(client, sock_fd, &xconn);
..
        /* Setup oplocks */
        if (!init_oplocks(sconn))
                exit_server("Failed to init oplocks");
..
}

The oplock message queue is registered along with the necessary callbacks.

bool init_oplocks(struct smbd_server_connection *sconn)
{
        DEBUG(3,("init_oplocks: initializing messages.\n"));

        messaging_register(sconn->msg_ctx, sconn, MSG_SMB_BREAK_REQUEST,
                           process_oplock_break_message);
        messaging_register(sconn->msg_ctx, sconn, MSG_SMB_KERNEL_BREAK,
                           process_kernel_oplock_break);
        return true;
}

This is the oplock break handler which handles oplock break requests coming in for files opened by this smbd process.
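
The registration pattern used by init_oplocks() is a table that maps a message type to a callback. The Python sketch below models just that idea. The class and its methods are hypothetical and far simpler than Samba's messaging context. Only the message type names are taken from the code above.

```python
class MessagingContext:
    """Toy message registry: one handler per message type."""

    def __init__(self):
        self.handlers = {}

    def register(self, msg_type, handler):
        self.handlers[msg_type] = handler

    def deliver(self, msg_type, payload):
        # Messages with no registered handler are ignored in this model.
        handler = self.handlers.get(msg_type)
        if handler is None:
            return None
        return handler(payload)


def process_oplock_break_message(payload):
    return f"breaking oplock on {payload}"


msg_ctx = MessagingContext()
msg_ctx.register("MSG_SMB_BREAK_REQUEST", process_oplock_break_message)

print(msg_ctx.deliver("MSG_SMB_BREAK_REQUEST", "file-42"))
```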

Triggering an oplock break

When another smbd process attempts to open the file, it ends up calling open_file_ntcreate().

In this case, we are not looking at the scenario where kernel oplocks are enabled. Kernel oplocks are useful when you have other processes (e.g. NFS) using the filesystem exported by Samba at the same time. Without kernel oplocks, other processes cannot safely use the filesystem since the locking info is stored by Samba in its own databases.

static NTSTATUS open_file_ntcreate(connection_struct *conn,
                            struct smb_request *req,
                            uint32_t access_mask,               /* access bits (FILE_READ_DATA etc.) */
                            uint32_t share_access,      /* share constants (FILE_SHARE_READ etc) */
                            uint32_t create_disposition,        /* FILE_OPEN_IF etc. */
                            uint32_t create_options,    /* options such as delete on close. */
                            uint32_t new_dos_attributes,        /* attributes used for new file. */
                            int oplock_request,         /* internal Samba oplock codes. */
                            struct smb2_lease *lease,
                                                        /* Information (FILE_EXISTS etc.) */
                            uint32_t private_flags,     /* Samba specific flags. */
                            int *pinfo,
                            files_struct *fsp)
{
..
        /* ignore any oplock requests if oplocks are disabled */
        if (!lp_oplocks(SNUM(conn)) ||
            IS_VETO_OPLOCK_PATH(conn, smb_fname->base_name)) {
                /* Mask off everything except the private Samba bits. */
                oplock_request &= SAMBA_PRIVATE_OPLOCK_MASK;
        }
..
        //First open the file
        fsp_open = open_file(fsp, conn, req, parent_dir,
                             flags|flags2, unx_mode, access_mask,
                             open_access_mask, &new_file_created);
..
        //Fetch the share mode from the database or allocate a fresh one if record doesn't exist.
        lck = get_share_mode_lock(talloc_tos(), id,
                                  conn->connectpath,
                                  smb_fname, &old_write_time);
..
        //Check to see if oplocks are set and if they violate the share mode
        status = open_mode_check(conn, lck,
                                 access_mask, share_access);
..
        //If there is a sharing violation, delay for oplock.
        if (req != NULL) {
                /*
                 * Handle oplocks, deferring the request if delay_for_oplock()
                 * triggered a break message and we have to wait for the break
                 * response.
                 */
                bool delay;
                bool sharing_violation = NT_STATUS_EQUAL(
                        status, NT_STATUS_SHARING_VIOLATION);

                //Here we end up calling send_break_message() to the smbd pid which opened the file 1st.
                delay = delay_for_oplock(fsp, oplock_request, lease, lck,
                                         sharing_violation,
                                         create_disposition,
                                         first_open_attempt);
                if (delay) {
                        schedule_defer_open(lck, fsp->file_id,
                                            request_time, req);
                        TALLOC_FREE(lck);
                        fd_close(fsp);
                        return NT_STATUS_SHARING_VIOLATION;
                }
        }
..
        //And finally set the new oplock for the file.
        /*
         * Setup the oplock info in both the shared memory and
         * file structs.
         */
        status = grant_fsp_oplock_type(req, fsp, lck, oplock_request, lease);
        if (!NT_STATUS_IS_OK(status)) {
                TALLOC_FREE(lck);
                fd_close(fsp);
                return status;
        }

}

Thus the oplock break is triggered.
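
The sequence above (check the share mode, break an existing oplock and defer, otherwise grant or refuse) can be modelled in a few lines of C. This is only a toy sketch and not Samba code: struct toy_open, toy_share_conflict() and toy_second_open() are hypothetical names, only the read and write bits are considered, and any existing oplock is assumed to need a break, which is simpler than what delay_for_oplock() really decides.

```c
#include <stdbool.h>
#include <stdint.h>

/* Access-mask and share-access bits as used by the SMB protocol. */
#define FILE_READ_DATA    0x00000001u
#define FILE_WRITE_DATA   0x00000002u
#define FILE_APPEND_DATA  0x00000004u
#define FILE_SHARE_READ   0x00000001u
#define FILE_SHARE_WRITE  0x00000002u

/* Hypothetical, simplified record of an existing open of the file. */
struct toy_open {
        uint32_t access_mask;
        uint32_t share_access;
        bool has_oplock;
};

enum toy_result { TOY_GRANTED, TOY_DEFERRED, TOY_SHARING_VIOLATION };

/* Returns true if the new open is incompatible with the existing one. */
bool toy_share_conflict(const struct toy_open *old,
                        uint32_t access_mask, uint32_t share_access)
{
        const uint32_t write_bits = FILE_WRITE_DATA | FILE_APPEND_DATA;

        /* What the new opener wants versus what the first opener permits. */
        if ((access_mask & FILE_READ_DATA) &&
            !(old->share_access & FILE_SHARE_READ)) {
                return true;
        }
        if ((access_mask & write_bits) &&
            !(old->share_access & FILE_SHARE_WRITE)) {
                return true;
        }
        /* What the first opener already has versus what the new one permits. */
        if ((old->access_mask & FILE_READ_DATA) &&
            !(share_access & FILE_SHARE_READ)) {
                return true;
        }
        if ((old->access_mask & write_bits) &&
            !(share_access & FILE_SHARE_WRITE)) {
                return true;
        }
        return false;
}

/* Toy version of the second open: break and defer, refuse, or grant. */
enum toy_result toy_second_open(const struct toy_open *old,
                                uint32_t access_mask, uint32_t share_access,
                                int *breaks_sent)
{
        if (old->has_oplock) {
                /* Stand-in for send_break_message() to the first opener. */
                (*breaks_sent)++;
                return TOY_DEFERRED;
        }
        if (toy_share_conflict(old, access_mask, share_access)) {
                return TOY_SHARING_VIOLATION;
        }
        return TOY_GRANTED;
}
```

In this model the deferred open is simply retried once the first opener has given up its oplock, which loosely corresponds to the deferred request being rescheduled after the break response arrives.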

Monday, March 05, 2018

Samba: Handling new connections

Samba uses the tevent library to handle new incoming connections. The Samba server relies on event handling to perform tasks such as creating a new process for each new client connection and handling the subsequent requests made by that client. Before the server can handle these events, each event of interest has to be registered along with the handler function that will be called for it.

More information on the tevent library is available at
https://tevent.samba.org/tevent_tutorial.html
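
The register-then-loop pattern can be illustrated without tevent itself. The following is a minimal sketch built on poll(); it is not the tevent API, and every name in it (struct toy_event_ctx, toy_add_fd(), toy_loop_once(), toy_count_handler()) is made up for illustration. A handler is registered for a file descriptor, and the loop calls it once the descriptor becomes readable.

```c
#include <poll.h>
#include <stddef.h>
#include <unistd.h>

#define TOY_MAX_FDS 8

typedef void (*toy_handler_fn)(int fd, void *private_data);

/* Hypothetical event context: parallel arrays of fds and handlers. */
struct toy_event_ctx {
        struct pollfd fds[TOY_MAX_FDS];
        toy_handler_fn handlers[TOY_MAX_FDS];
        void *private_data[TOY_MAX_FDS];
        nfds_t count;
};

/* Register a handler to be called when fd becomes readable. */
int toy_add_fd(struct toy_event_ctx *ev, int fd,
               toy_handler_fn fn, void *private_data)
{
        if (ev->count >= TOY_MAX_FDS) {
                return -1;
        }
        ev->fds[ev->count].fd = fd;
        ev->fds[ev->count].events = POLLIN;
        ev->fds[ev->count].revents = 0;
        ev->handlers[ev->count] = fn;
        ev->private_data[ev->count] = private_data;
        ev->count++;
        return 0;
}

/*
 * Wait for one round of events and dispatch them.
 * Returns the number of handlers called, or -1 on error.
 */
int toy_loop_once(struct toy_event_ctx *ev, int timeout_ms)
{
        int called = 0;
        int ready = poll(ev->fds, ev->count, timeout_ms);

        if (ready < 0) {
                return -1;
        }
        for (nfds_t i = 0; i < ev->count; i++) {
                if (ev->fds[i].revents & POLLIN) {
                        ev->handlers[i](ev->fds[i].fd, ev->private_data[i]);
                        called++;
                }
        }
        return called;
}

/* Example handler: consume one byte and count the invocation. */
void toy_count_handler(int fd, void *private_data)
{
        char c;
        int *counter = private_data;

        if (read(fd, &c, 1) == 1) {
                (*counter)++;
        }
}
```

A caller would register the read end of a pipe or a socket with toy_add_fd() and then call toy_loop_once() repeatedly, which is roughly the role the tevent loop plays in smbd.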

We look at the code path at the start of the smbd process. We start in main().

int main(int argc,const char *argv[])
{
..
        struct tevent_context *ev_ctx;
..
        /*
         * Initialize the event context. The event context needs to be
         * initialized before the messaging context, cause the messaging
         * context holds an event context.
         */
        // This eventually returns tevent_context_init(NULL)
        ev_ctx = server_event_context();
        if (ev_ctx == NULL) {
                exit(1);
        }
..
        //Read notes on this function below.
        if (!open_sockets_smbd(parent, ev_ctx, msg_ctx, ports))
                exit_server("open_sockets_smbd() failed");
..
        //Loop and wait for events.
        //This is where the registered tevent handlers get called,
        //for example when a new connection arrives on a listening socket.
        smbd_parent_loop(ev_ctx, parent);
..
}

The function open_sockets_smbd() opens the listening sockets and registers the tevent handlers required to handle socket communication.

static bool open_sockets_smbd(struct smbd_parent_context *parent,
                              struct tevent_context *ev_ctx,
                              struct messaging_context *msg_ctx,
                              const char *smb_ports)
{
..
                                /*
                                 * If we fail to open any sockets
                                 * in this loop the parent-sockets == NULL
                                 * case below will prevent us from starting.
                                 */

                                (void)smbd_open_one_socket(parent,
                                                  ev_ctx,
                                                  &ss,
                                                  port);
..
}

static bool smbd_open_one_socket(struct smbd_parent_context *parent,
                                 struct tevent_context *ev_ctx,
                                 const struct sockaddr_storage *ifss,
                                 uint16_t port)
{
..
        s->fd = open_socket_in(SOCK_STREAM,
                               port,
                               parent->sockets == NULL ? 0 : 2,
                               ifss,
                               true);
..
        /* ready to listen */
        set_socket_options(s->fd, "SO_KEEPALIVE");
        set_socket_options(s->fd, lp_socket_options());

        /* Set server socket to
         * non-blocking for the accept. */
        set_blocking(s->fd, False);
..
        //This registers a tevent fd event for the listening socket.
        //The handling function smbd_accept_connection()
        //is responsible for accepting new client connections.
        s->fde = tevent_add_fd(ev_ctx,
                               s,
                               s->fd, TEVENT_FD_READ,
                               smbd_accept_connection,
                               s);
..
        tevent_fd_set_close_fn(s->fde, smbd_open_socket_close_fn);
..
}
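
The socket setup in smbd_open_one_socket() follows the usual POSIX pattern: create the socket, set options such as SO_KEEPALIVE, and switch it to non-blocking mode before waiting for connections. Below is a plain POSIX sketch of those steps. It does not use Samba's open_socket_in() or set_socket_options() helpers, toy_open_listener() is a hypothetical name, and binding to the loopback address is an assumption made to keep the example harmless.

```c
#include <arpa/inet.h>
#include <fcntl.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Open a non-blocking TCP listening socket on the loopback address.
 * Port 0 lets the kernel pick a free port.
 * Returns the listening fd, or -1 on failure.
 */
int toy_open_listener(uint16_t port)
{
        struct sockaddr_in addr;
        int on = 1;
        int flags;
        int fd = socket(AF_INET, SOCK_STREAM, 0);

        if (fd < 0) {
                return -1;
        }

        memset(&addr, 0, sizeof(addr));
        addr.sin_family = AF_INET;
        addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
        addr.sin_port = htons(port);

        /* Socket options first, then bind and listen. */
        if (setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &on, sizeof(on)) != 0 ||
            setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on)) != 0 ||
            bind(fd, (struct sockaddr *)&addr, sizeof(addr)) != 0 ||
            listen(fd, SOMAXCONN) != 0) {
                close(fd);
                return -1;
        }

        /* Non-blocking, so that accept() never stalls the event loop. */
        flags = fcntl(fd, F_GETFL, 0);
        if (flags < 0 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) != 0) {
                close(fd);
                return -1;
        }
        return fd;
}
```

With the socket in non-blocking mode, an accept() call made when no client is waiting returns immediately with an error instead of blocking, so an event loop that calls accept() from a read handler cannot get stuck on it.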

The parent process then loops, waiting for events to be handled.

int main(int argc,const char *argv[])
{
..
        smbd_parent_loop(ev_ctx, parent);
..
}

A new incoming connection triggers the tevent fd event registered for the listening socket, whose handling function is set to
smbd_accept_connection().

static void smbd_accept_connection(struct tevent_context *ev,
                                   struct tevent_fd *fde,
                                   uint16_t flags,
                                   void *private_data)
{
..
        pid = fork();
        //For child process
        if (pid == 0) {
..
                //Process the incoming request.
                smbd_process(ev, msg_ctx, fd, false);
         exit:
                exit_server_cleanly("end of child");
                return;
        }
        //For the parent process ie. main smbd process.

        /* The parent doesn't need this socket */
        close(fd);
..
        if (pid != 0) {
                add_child_pid(s->parent, pid);
        }
..
}

The parent process at this point forks a child process which is used to handle the new client. The parent process continues looping in main().

Below is the code path followed by the child process.

void smbd_process(struct tevent_context *ev_ctx,
                  struct messaging_context *msg_ctx,
                  int sock_fd,
                  bool interactive)
{
..
        struct smbXsrv_client *client = NULL;
        ..
        struct smbXsrv_connection *xconn = NULL;
..
        //Create a new struct to store information about this client
        status = smbXsrv_client_create(ev_ctx, ev_ctx, msg_ctx, now, &client);
..
        //The connection information itself is stored in xconn
        status = smbd_add_connection(client, sock_fd, &xconn);
..
        //Loop and wait for new events.
        ret = tevent_loop_wait(ev_ctx);
..
}

With multichannel support, multiple connections can be associated with the same client. We do not consider multichannel in this document.

NTSTATUS smbd_add_connection(struct smbXsrv_client *client, int sock_fd,
                             struct smbXsrv_connection **_xconn)
{
..
        xconn = talloc_zero(client, struct smbXsrv_connection);
..
        xconn->transport.fde = tevent_add_fd(client->ev_ctx,
                                             xconn,
                                             sock_fd,
                                             TEVENT_FD_READ,
                                             smbd_server_connection_handler,
                                             xconn);
..
        /* for now we only have one connection */
        DLIST_ADD_END(client->connections, xconn);
        xconn->client = client;
..
}

The new connection, represented by the struct smbXsrv_connection xconn, is now attached to the client.
The tevent handler for this socket is set to smbd_server_connection_handler().
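
The relationship between a client and its connections is a list hanging off the client, with each connection pointing back to its owner. A simplified sketch follows. It uses a singly linked list and calloc() where Samba uses its DLIST macros and talloc, and struct toy_client, struct toy_connection, toy_add_connection() and toy_count_connections() are all hypothetical names.

```c
#include <stddef.h>
#include <stdlib.h>

struct toy_client;

/* One transport connection, linked into its client's list. */
struct toy_connection {
        struct toy_connection *next;
        struct toy_client *client;
        int sock_fd;
};

/* A client owns zero or more connections. */
struct toy_client {
        struct toy_connection *connections;
};

/* Allocate a connection for sock_fd and append it to the client's list. */
struct toy_connection *toy_add_connection(struct toy_client *client,
                                          int sock_fd)
{
        struct toy_connection **tail = &client->connections;
        struct toy_connection *conn = calloc(1, sizeof(*conn));

        if (conn == NULL) {
                return NULL;
        }
        conn->sock_fd = sock_fd;
        conn->client = client;

        /* Walk to the end of the list so new connections go last. */
        while (*tail != NULL) {
                tail = &(*tail)->next;
        }
        *tail = conn;
        return conn;
}

/* Count the connections currently attached to the client. */
size_t toy_count_connections(const struct toy_client *client)
{
        size_t n = 0;
        const struct toy_connection *c;

        for (c = client->connections; c != NULL; c = c->next) {
                n++;
        }
        return n;
}
```

Without multichannel the list holds a single entry, but keeping it as a list is what allows further connections to be attached to the same client later.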

The child smbd process is now attached to a single client connection. It loops in smbd_process():

void smbd_process(struct tevent_context *ev_ctx,
                  struct messaging_context *msg_ctx,
                  int sock_fd,
                  bool interactive)
{
..
        //Loop and wait for new events.
        ret = tevent_loop_wait(ev_ctx);
..
}

Any new data now sent to this socket will trigger the event handler smbd_server_connection_handler().

static void smbd_server_connection_handler(struct tevent_context *ev,
                                           struct tevent_fd *fde,
                                           uint16_t flags,
                                           void *private_data)
{
..
        if (!NT_STATUS_IS_OK(xconn->transport.status)) {
                /*
                 * we're not supposed to do any io
                 */
                TEVENT_FD_NOT_READABLE(xconn->transport.fde);
                TEVENT_FD_NOT_WRITEABLE(xconn->transport.fde);
                return;
        }
..
        if (flags & TEVENT_FD_WRITE) {
                smbd_server_connection_write_handler(xconn);
                return;
        }
        if (flags & TEVENT_FD_READ) {
                smbd_server_connection_read_handler(xconn, xconn->transport.sock);
                return;
        }
}

static void smbd_server_connection_read_handler(
        struct smbXsrv_connection *xconn, int fd)
{
..
        status = receive_smb_talloc(mem_ctx, xconn, fd,
                                    (char **)(void *)&inbuf,
                                    0, /* timeout */
                                    &unread_bytes,
                                    &encrypted,
                                    &inbuf_len, &seqnum,
                                    !from_client /* trusted channel */);
..
        process_smb(xconn, inbuf, inbuf_len, unread_bytes,
                    seqnum, encrypted, NULL);
}

The call is then processed by the Samba server as needed in process_smb().
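
One detail in smbd_server_connection_handler() is the order of its checks: no I/O at all if the transport is in an error state, then writes, then reads, and each branch returns immediately. The sketch below isolates that decision as a pure function. toy_dispatch() and the TOY_FD_* flag values are hypothetical stand-ins for the tevent flags, not the library's definitions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag bits standing in for TEVENT_FD_READ and TEVENT_FD_WRITE. */
#define TOY_FD_READ  0x1u
#define TOY_FD_WRITE 0x2u

enum toy_action { TOY_NONE, TOY_DO_WRITE, TOY_DO_READ };

/*
 * Decide what a single handler invocation does, in the same order
 * as the handler above: error state first, then write, then read.
 */
enum toy_action toy_dispatch(bool transport_ok, uint16_t flags)
{
        if (!transport_ok) {
                /* The connection is broken: do no I/O at all. */
                return TOY_NONE;
        }
        if (flags & TOY_FD_WRITE) {
                return TOY_DO_WRITE;
        }
        if (flags & TOY_FD_READ) {
                return TOY_DO_READ;
        }
        return TOY_NONE;
}
```

When both flags are set, only the write is handled in that invocation. The pending read is presumably picked up on a later pass through the event loop, since the socket stays readable until the data has been consumed.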