Setting up clustered NFS
Prereqs
Configure CTDB as above and set it up to use public IP addresses.
Verify that the CTDB cluster works.
sm-notify
Make sure you have the sm-notify tool installed in /usr/sbin.
You should find this tool in the nfs-utils package for your operating system.
CTDB requires this tool to trigger lock recovery after an IP address failover or failback.
This tool must be installed as /usr/sbin/sm-notify on all nodes in the cluster.
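The requirement above can be checked with a small script run on each node. This is a minimal sketch, not part of CTDB: the function name check_sm_notify is made up for illustration, and the only thing it verifies is that the path exists and is executable.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with CTDB): return 0 if the given path is
# an executable regular file, 1 otherwise. The default is the location this
# guide requires.
check_sm_notify() {
    path="${1:-/usr/sbin/sm-notify}"
    if [ -f "$path" ] && [ -x "$path" ]; then
        echo "OK: $path is installed and executable"
        return 0
    fi
    echo "MISSING: $path not found or not executable" >&2
    return 1
}
```

Calling check_sm_notify with no argument tests /usr/sbin/sm-notify. Repeat the check on every node in the cluster.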
/etc/exports
Export the same directory from all nodes.
Make sure to specify the fsid export option so that all nodes will present the same fsid to clients.
Clients can get "upset" (for example, by reporting stale file handles) if the fsid on a mount suddenly changes.
Example /etc/exports:
/gpfs0/data *(rw,fsid=1235)
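A quick way to catch an export that lacks the fsid option is to scan the exports file for it. The sketch below assumes one export per line with no backslash line continuations. The function name exports_missing_fsid is hypothetical. It only checks that "fsid=" appears somewhere on each line, so it neither confirms that every client specification on a line carries the option nor that the value is the same on all nodes.

```shell
#!/bin/sh
# Hypothetical helper: print every export line that has no fsid= option.
# Returns 1 if at least one such line is found, 0 otherwise.
exports_missing_fsid() {
    file="${1:-/etc/exports}"
    # Drop blank lines and comments, then keep the lines without "fsid=".
    missing=$(grep -v -E '^[[:space:]]*(#|$)' "$file" | grep -v 'fsid=' || true)
    if [ -n "$missing" ]; then
        printf '%s\n' "$missing"
        return 1
    fi
    return 0
}
```

Run exports_missing_fsid on each node. To confirm that all nodes present the same fsid, compare the files themselves between nodes.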
/etc/sysconfig/nfs
This file must be edited so that statd keeps its state directory on shared storage instead of in a local directory.
statd must also listen on a fixed port that is the same on all nodes in the cluster.
Without a fixed port, the statd port changes during failover, which causes problems on some clients (some clients are very slow to notice that the port has changed).
This file should look something like:
CTDB_MANAGES_NFS=yes
CTDB_MANAGES_NFSLOCK=yes
STATD_SHARED_DIRECTORY=/gpfs0/nfs-state
STATD_HOSTNAME="ctdb -P $STATD_SHARED_DIRECTORY/192.168.1.1 -H /etc/ctdb/statd-callout -p 97"
The CTDB_MANAGES_NFS line tells the event scripts that CTDB is to manage startup and shutdown of the NFS and NFSLOCK services.
The CTDB_MANAGES_NFSLOCK line tells the event scripts that CTDB is also to manage the NFS lock manager.
With these set to yes, CTDB will start, stop and restart these services as required.
STATD_SHARED_DIRECTORY is the shared directory where statd and the statd-callout script expect to find the state variables and the lists of clients to notify.
This directory must be stored on the shared cluster filesystem so that all nodes can access the same data.
Don't forget to create this directory:
mkdir /gpfs0/nfs-state
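The settings described in this section can be sanity-checked before CTDB is started. The following is a rough sketch under stated assumptions: check_nfs_sysconfig is a hypothetical name, the file is sourced as shell (so any code in it runs), and the check only confirms that the variables are set and that the state directory exists. It cannot tell whether that directory is actually on the shared cluster filesystem.

```shell
#!/bin/sh
# Hypothetical helper: source the file in a subshell and confirm the settings
# described above are present. Returns 0 if all checks pass, 1 otherwise.
check_nfs_sysconfig() (
    file="${1:-/etc/sysconfig/nfs}"
    . "$file" || exit 1
    status=0
    [ "${CTDB_MANAGES_NFS:-}" = "yes" ] ||
        { echo "CTDB_MANAGES_NFS is not set to yes" >&2; status=1; }
    [ "${CTDB_MANAGES_NFSLOCK:-}" = "yes" ] ||
        { echo "CTDB_MANAGES_NFSLOCK is not set to yes" >&2; status=1; }
    [ -n "${STATD_HOSTNAME:-}" ] ||
        { echo "STATD_HOSTNAME is not set" >&2; status=1; }
    if [ -z "${STATD_SHARED_DIRECTORY:-}" ]; then
        echo "STATD_SHARED_DIRECTORY is not set" >&2; status=1
    elif [ ! -d "$STATD_SHARED_DIRECTORY" ]; then
        echo "$STATD_SHARED_DIRECTORY does not exist" >&2; status=1
    fi
    exit "$status"
)
```

Because the function body runs in a subshell, the sourced variables do not leak into the calling shell.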
chkconfig
Since CTDB will manage and start/stop/restart the nfs and the nfslock services, you must disable them using chkconfig.
chkconfig nfs off
chkconfig nfslock off
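To confirm that the services are really disabled, the runlevel listing can be inspected. The sketch below assumes the SysV-style output of chkconfig --list, where each line holds a service name followed by fields such as 3:on or 3:off. The function name services_still_enabled is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: read "chkconfig --list" style lines on stdin and print
# the name of each service given as an argument that is still "on" in at
# least one runlevel.
services_still_enabled() {
    awk -v wanted="$*" '
        BEGIN {
            n = split(wanted, names, " ")
            for (i = 1; i <= n; i++) want[names[i]] = 1
        }
        ($1 in want) {
            for (i = 2; i <= NF; i++)
                if ($i ~ /:on$/) { print $1; break }
        }
    '
}
```

A typical invocation would be chkconfig --list | services_still_enabled nfs nfslock. Empty output means neither service is enabled in any runlevel.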
Event scripts
CTDB clustering for NFS relies on two event scripts: /etc/ctdb/events.d/nfs and /etc/ctdb/events.d/nfslock.
These two scripts are provided by the RPM package and there should not be any need to change them.
IMPORTANT
Never mount the same NFS share on a client from two different nodes in the cluster at the same time!
Client-side caching in NFS is very fragile and relies on each object being accessible through only one path at a time.
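On a client, a rough check for this mistake is to look for the same export path mounted from more than one server address. This is only a heuristic sketch: duplicate_nfs_exports is a hypothetical name, the input is assumed to be in /proc/mounts format, and the script cannot know whether two addresses belong to the same cluster, so an identical path exported by unrelated servers is reported as well.

```shell
#!/bin/sh
# Hypothetical helper: scan a mounts table (default /proc/mounts) and print
# each NFS export path that is mounted from more than one server.
# Returns 1 if any such path is found, 0 otherwise.
duplicate_nfs_exports() {
    awk '
        $3 == "nfs" || $3 == "nfs4" {
            idx = index($1, ":/")          # split "server:/path"
            if (idx == 0) next
            server = substr($1, 1, idx - 1)
            path = substr($1, idx + 1)
            if (!((path, server) in seen)) {
                seen[path, server] = 1
                count[path]++
                servers[path] = servers[path] " " server
            }
        }
        END {
            for (p in count)
                if (count[p] > 1) { print p ":" servers[p]; found = 1 }
            exit (found ? 1 : 0)
        }
    ' "${1:-/proc/mounts}"
}
```

Each output line names an export path followed by the server addresses it is currently mounted from.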