NFS-Ganesha 2.3 is rapidly winding down to release and it has a bunch of new things in it that make it fairly compelling. A lot of people are also starting to use Red Hat Gluster Storage with the NFS-Ganesha NFS server that is part of that package. Setting up a highly available NFS-Ganesha system using GlusterFS is not exactly trivial. This blog post will “eat the elephant” one bite at a time.
Some people might wonder why they should use NFS-Ganesha — a user-space NFS server — when kernel NFS (knfs) already supports NFSv4. The answer is simple, really: NFSv4 in the kernel doesn’t scale out, and it’s a single point of failure. This blog post will show how to set up a resilient, highly available system with no single point of failure.
Crawl
Let’s start small and simple. We’ll set up a single NFS-Ganesha server on CentOS 7, serving a single disk volume.
Start by setting up a CentOS 7 machine. You may want to create a separate volume for the NFS export; we’ll leave that as an exercise for the reader. Do not install any NFS server packages.
1. Install EPEL, NFS-Ganesha, and GlusterFS, using the yum repos on download.gluster.org. The repo files are nfs-ganesha.repo and glusterfs-epel.repo; copy them to /etc/yum.repos.d.
% yum -y install epel-release
% yum -y install glusterfs-server glusterfs-fuse glusterfs-cli glusterfs-ganesha
% yum -y install nfs-ganesha-xfs
2. Create a directory to mount the export volume, make a file system on the export volume, and finally mount it:
% mkdir -p /bricks/demo
% mkfs.xfs /dev/sdb
% mount /dev/sdb /bricks/demo
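If you want the brick mount to survive a reboot, add it to /etc/fstab as well; a minimal sketch, assuming /dev/sdb and the mount point above:
% echo '/dev/sdb /bricks/demo xfs defaults 0 0' >> /etc/fstab
% mount -a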
3. Gluster recommends not creating volumes on the root directory of the brick. If something goes wrong, it’s easier to rm -rf a subdirectory than it is to try to clean it out or remake the file system. Create a couple of subdirectories on the brick:
% mkdir /bricks/demo/vol
% mkdir /bricks/demo/scratch
4. Edit the Ganesha config file at /etc/ganesha/ganesha.conf. Here’s what mine looks like:
EXPORT {
    # Export Id (mandatory, each EXPORT must have a unique Export_Id)
    Export_Id = 1;

    # Exported path (mandatory)
    Path = /bricks/demo/scratch;

    # Pseudo Path (required for NFS v4)
    Pseudo = /bricks/demo/scratch;

    # Required for access (default is None)
    # Could use CLIENT blocks instead
    Access_Type = RW;

    # Exporting FSAL
    FSAL {
        Name = XFS;
    }
}
5. Start ganesha:
% systemctl start nfs-ganesha
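Before mounting, it doesn’t hurt to confirm the server came up cleanly; assuming NFSv3 is left enabled (the default), showmount should also list the export:
% systemctl status nfs-ganesha
% showmount -e localhost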
6. Wait one minute for NFS grace to end, then mount the volume:
% mount localhost:/scratch /mnt
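A quick sanity check that the export really is mounted (with the default root squash, you may need to loosen permissions on /bricks/demo/scratch before writing to it as root):
% df -h /mnt
% ls /mnt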
Walk
7. Now we’ll create a simple Gluster volume and use NFS-Ganesha to serve it. We also need to disable Gluster’s built-in NFS server (gnfs).
% gluster volume create simple $(hostname):/bricks/demo/simple
% gluster volume set simple nfs.disable on
% gluster volume start simple
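Before pointing Ganesha at it, you can sanity-check the new volume:
% gluster volume info simple
% gluster volume status simple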
8. Edit the Ganesha config file at /etc/ganesha/ganesha.conf. Here’s what mine looks like:
EXPORT {
    # Export Id (mandatory, each EXPORT must have a unique Export_Id)
    Export_Id = 1;

    # Exported path (mandatory)
    Path = /simple;

    # Pseudo Path (required for NFS v4)
    Pseudo = /simple;

    # Required for access (default is None)
    # Could use CLIENT blocks instead
    Access_Type = RW;

    # Exporting FSAL
    FSAL {
        Name = GLUSTER;
        Hostname = localhost;
        Volume = simple;
    }
}
9. Restart ganesha:
% systemctl stop nfs-ganesha
% systemctl start nfs-ganesha
10. Wait one minute for NFS grace to end, then mount the volume:
% mount localhost:/simple /mnt
Copy a file to the NFS volume. You’ll see it on the gluster brick in /bricks/demo/simple.
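For example (if the write is denied, loosen permissions on the brick directory or adjust the export’s squash setting):
% cp /etc/hosts /mnt/
% ls -l /bricks/demo/simple/hosts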
Run
Now for the part you’ve been waiting for. For this we’ll start from scratch. This will be a four node cluster: node0, node1, node2, and node3.
1. Tear down anything left over from the above.
2. Ensure that all nodes are resolvable either in DNS or /etc/hosts:
node0% cat /etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
172.16.3.130 node0
172.16.3.131 node1
172.16.3.132 node2
172.16.3.133 node3
172.16.3.140 node0v
172.16.3.141 node1v
172.16.3.142 node2v
172.16.3.143 node3v
3. Set up passwordless ssh among the four nodes. On node0, create a keypair and deploy it to all the nodes:
node0% ssh-keygen -f /var/lib/glusterd/nfs/secret.pem
node0% ssh-copy-id -i /var/lib/glusterd/nfs/secret.pem.pub root@node0
node0% ssh-copy-id -i /var/lib/glusterd/nfs/secret.pem.pub root@node1
node0% ssh-copy-id -i /var/lib/glusterd/nfs/secret.pem.pub root@node2
node0% ssh-copy-id -i /var/lib/glusterd/nfs/secret.pem.pub root@node3
node0% scp /var/lib/glusterd/nfs/secret.* node1:/var/lib/glusterd/nfs/
node0% scp /var/lib/glusterd/nfs/secret.* node2:/var/lib/glusterd/nfs/
node0% scp /var/lib/glusterd/nfs/secret.* node3:/var/lib/glusterd/nfs/
You can confirm that it works with:
node0% ssh -oPasswordAuthentication=no -oStrictHostKeyChecking=no -i /var/lib/glusterd/nfs/secret.pem root@node1
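To check all four nodes in one go, a quick loop works just as well (purely a convenience; adjust the node names to your own):
node0% for n in node0 node1 node2 node3; do \
           ssh -oPasswordAuthentication=no -oStrictHostKeyChecking=no \
               -i /var/lib/glusterd/nfs/secret.pem root@$n hostname; \
       done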
4. Start glusterd on all nodes:
node0% systemctl enable glusterd && systemctl start glusterd
node1% systemctl enable glusterd && systemctl start glusterd
node2% systemctl enable glusterd && systemctl start glusterd
node3% systemctl enable glusterd && systemctl start glusterd
5. From node0, peer probe the other nodes:
node0% gluster peer probe node1
peer probe: success
node0% gluster peer probe node2
peer probe: success
node0% gluster peer probe node3
peer probe: success
You can confirm their status with:
node0% gluster peer status
Number of Peers: 3
Hostname: node1
Uuid: ca8e1489-0f1b-4814-964d-563e67eded24
State: Peer in Cluster (Connected)
Hostname: node2
Uuid: 37ea06ff-53c2-42eb-aff5-a1afb7a6bb59
State: Peer in Cluster (Connected)
Hostname: node3
Uuid: e1fb733f-8e4e-40e4-8933-e215a183866f
State: Peer in Cluster (Connected)
6. Create the /etc/ganesha/ganesha-ha.conf file on node0. Here’s what mine looks like:
# Name of the HA cluster created.
# Must be unique within the subnet.
HA_NAME="demo-cluster"
#
# The gluster server from which to mount the shared data volume.
HA_VOL_SERVER="node0"
#
# You may use short names or long names; you may not use IP addresses.
# Once you select one, stay with it, as it will be mildly unpleasant to
# clean up if you switch later on. Ensure that all names - short and/or
# long - are in DNS or /etc/hosts on all machines in the cluster.
#
# The subset of nodes of the Gluster Trusted Pool that form the ganesha
# HA cluster. Hostnames are specified, not IP addresses.
HA_CLUSTER_NODES="node0,node1,node2,node3"
#
# Virtual IPs for each of the nodes specified above.
VIP_node0="172.16.3.140"
VIP_node1="172.16.3.141"
VIP_node2="172.16.3.142"
VIP_node3="172.16.3.143"
7. Enable the Gluster shared storage volume:
node0% gluster volume set all cluster.enable-shared-storage enable
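Behind the scenes this creates a volume named gluster_shared_storage and mounts it on every node (under /var/run/gluster/shared_storage on recent releases); give it a few seconds, then verify:
node0% gluster volume info gluster_shared_storage
node0% df -h /var/run/gluster/shared_storage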
8. Enable and start the Pacemaker pcsd on all nodes:
node0% systemctl enable pcsd && systemctl start pcsd
node1% systemctl enable pcsd && systemctl start pcsd
node2% systemctl enable pcsd && systemctl start pcsd
node3% systemctl enable pcsd && systemctl start pcsd
9. Set a password for the user ‘hacluster’ on all nodes. Use the same password for all nodes:
node0% echo demopass | passwd --stdin hacluster
node1% echo demopass | passwd --stdin hacluster
node2% echo demopass | passwd --stdin hacluster
node3% echo demopass | passwd --stdin hacluster
10. Perform cluster auth between the nodes. The username is ‘hacluster’ and the password is the one you set in step 9:
node0% pcs cluster auth node0
node0% pcs cluster auth node1
node0% pcs cluster auth node2
node0% pcs cluster auth node3
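Depending on your pcs version, you can also authenticate all four nodes with a single command; a sketch, using the password from step 9:
node0% pcs cluster auth node0 node1 node2 node3 -u hacluster -p demopass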
11. Create the Gluster volume to export. We’ll create a 2×2 distributed-replicate volume, then start it:
node0% gluster volume create cluster-demo replica 2 node0:/home/bricks/demo node1:/home/bricks/demo node2:/home/bricks/demo node3:/home/bricks/demo
node0% gluster volume start cluster-demo
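At this point gluster volume info should report a 2 x 2 Distributed-Replicate volume across the four bricks:
node0% gluster volume info cluster-demo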
12. Edit the Ganesha config file at /etc/ganesha/ganesha.conf on node0. Here’s what mine looks like:
EXPORT {
    # Export Id (mandatory, each EXPORT must have a unique Export_Id)
    Export_Id = 1;

    # Exported path (mandatory)
    Path = /demo;

    # Pseudo Path (required for NFS v4)
    Pseudo = /demo;

    # Required for access (default is None)
    # Could use CLIENT blocks instead
    Access_Type = RW;

    # Exporting FSAL
    FSAL {
        Name = GLUSTER;
        Hostname = localhost;
        Volume = cluster-demo;
    }
}
13. Finally, from node0, enable NFS-Ganesha HA:
node0% gluster nfs-ganesha enable
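The command takes a little while to assemble the Pacemaker cluster. When it finishes, the virtual IPs and ganesha resources should be running on all four nodes, and you can mount the export through any of the VIPs; for example, using the names from my /etc/hosts above:
node0% pcs status
node0% mount node0v:/demo /mnt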