| Commit message (Collapse) | Author | Age | Files | Lines |
|
Currently, if getaddrinfo(3) fails when trying to resolve a hostname,
sm-notify gives up immediately on that host. If sm-notify is started
before network service is available on a system, that means it quits
without notifying anyone. Or, if DNS service isn't available due to
a network partition or because the DNS server crashed, sm-notify will
simply remove all of its callback files and exit.
Really, sm-notify should try harder. We know that the hostnames
passed in to notify_host() have already been vetted by statd, which
won't monitor a hostname that it can't resolve. So it's likely that
any DNS failure we meet here is a temporary condition. If it isn't,
then sm-notify will stop trying to notify that host in 15 minutes
anyway.
[ The host's file is left in /var/lib/nfs/sm.bak in this case, but
sm.bak is not read again until the next time sm-notify runs. ]
sm-notify already has retry logic for handling RPC timeouts. We can
co-opt that to drive DNS resolution retries.
We also add AI_ADDRCONFIG because on systems whose network startup is
handled by NetworkManager, there appears to be a bug that causes
processes that started calling getaddrinfo(3) before the network came
up to continue getting EAI_AGAIN even after the network is fully
operating.
As I understand it, legacy glibc (before AI_ADDRCONFIG was exposed in
headers) sets AI_ADDRCONFIG by default, although I haven't checked
this. In any event, pre-glibc-2.2 systems probably won't run
NetworkManager anyway, so this may not be much of a problem for them.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Steve Dickson <steved@redhat.com>
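The retry idea described in this message can be sketched roughly as follows. This is only an illustration, not the code from sm-notify.c: the names dns_action, dns_next_action() and dns_try_lookup() are hypothetical, and the caller is assumed to supply a deadline (for example 15 minutes after startup) and to call again from its existing retry timer.

```c
#define _POSIX_C_SOURCE 200112L
#include <assert.h>
#include <netdb.h>
#include <netinet/in.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <time.h>

/* Hypothetical names for this sketch; none are taken from sm-notify.c. */
enum dns_action { DNS_READY, DNS_RETRY, DNS_GIVE_UP };

/*
 * Decide what to do with the return code of getaddrinfo(3).
 * Every failure is treated as temporary until the deadline passes.
 */
enum dns_action dns_next_action(int gai_error, time_t now, time_t deadline)
{
	if (gai_error == 0)
		return DNS_READY;
	return (now < deadline) ? DNS_RETRY : DNS_GIVE_UP;
}

/*
 * One resolution attempt. On DNS_READY, *res holds a list that the
 * caller must release with freeaddrinfo(3).
 */
enum dns_action dns_try_lookup(const char *hostname, time_t now,
			       time_t deadline, struct addrinfo **res)
{
	struct addrinfo hints;
	int error;

	assert(hostname != NULL && res != NULL);
	*res = NULL;

	memset(&hints, 0, sizeof(hints));
	hints.ai_protocol = IPPROTO_UDP;
#ifdef AI_ADDRCONFIG
	/* Guarded because older headers may not define the flag. */
	hints.ai_flags = AI_ADDRCONFIG;
#endif

	error = getaddrinfo(hostname, NULL, &hints, res);
	return dns_next_action(error, now, deadline);
}
```

Keeping the decision in dns_next_action() separate from the lookup means the give-up policy can be checked without touching the network.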
|
sm-notify orphans an addrinfo struct in its address list rotation
logic if only a single result was returned from getaddrinfo(3).
For each host, the first time through notify_host(), we want to
send a PMAP_GETPORT request. ->ai is NULL, and retries is set to 100,
forcing a DNS lookup and an address rotation. If only a single
addrinfo struct is returned, the rotation logic causes a NULL to be
planted in ->ai, copied from the ai_next field of the returned result.
This means that the second time through notify_host() (to perform the
actual SM_NOTIFY call) we do a second DNS lookup, since ->ai is NULL.
The result of the first lookup has been orphaned, and extra network
traffic is generated.
This scenario is actually fairly common. Since we pass
.ai_protocol = IPPROTO_UDP,
to getaddrinfo(3), for most hosts, which have a single forward and
reverse pointer in the DNS database, we get back a single addrinfo
struct as a result.
To address this problem, only perform the address list rotation if
there is more than one element on the list returned by getaddrinfo(3).
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Steve Dickson <steved@redhat.com>
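A minimal sketch of a rotation that leaves a single-element list alone, assuming the list is an ordinary addrinfo chain linked through ai_next. The helper name rotate_addrs() is hypothetical and the real sm-notify code may be structured differently.

```c
#define _POSIX_C_SOURCE 200112L
#include <assert.h>
#include <netdb.h>
#include <stddef.h>
#include <sys/socket.h>

/*
 * Move the first element of an addrinfo list to the tail and return
 * the new head. A list with zero or one element is returned unchanged,
 * so a valid lookup result is never replaced by NULL.
 */
struct addrinfo *rotate_addrs(struct addrinfo *head)
{
	struct addrinfo *first = head;
	struct addrinfo *tail;

	if (head == NULL || head->ai_next == NULL)
		return head;

	/* Walk to the last element. */
	for (tail = head; tail->ai_next != NULL; tail = tail->ai_next)
		;

	head = first->ai_next;
	first->ai_next = NULL;
	tail->ai_next = first;

	assert(head != NULL);
	return head;
}
```

With this shape, a caller that stores the return value back into its ->ai field keeps the first lookup result and has no reason to issue a second DNS query.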
|
flag has been set. This causes warnings to be generated when
return values from reads/writes (and other calls) are not
checked. This patch addresses those warnings.
Signed-off-by: Steve Dickson <steved@redhat.com>
|
The recv_reply() function was referencing host->ai in a freeaddrinfo(3)
call after it had freed @host.
This is not likely to be harmful in a single-threaded user context,
but it's still bad form, and it will get called out if testing
sm-notify with poisoned free memory. The less noise, the better we
are able to see real problems.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Steve Dickson <steved@redhat.com>
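The ordering issue can be illustrated with a small sketch. The struct host_sketch type and host_destroy() helper are hypothetical stand-ins, not the definitions in sm-notify.c; the point is only that everything reached through the host pointer is released before the host itself.

```c
#define _POSIX_C_SOURCE 200112L
#include <assert.h>
#include <netdb.h>
#include <stdlib.h>
#include <sys/socket.h>

/* Hypothetical stand-in for sm-notify's per-host record. */
struct host_sketch {
	char *name;
	struct addrinfo *ai;
};

/*
 * Release everything the host owns before releasing the host itself,
 * then clear the caller's pointer so it cannot be reused by accident.
 */
void host_destroy(struct host_sketch **hostp)
{
	struct host_sketch *host;

	assert(hostp != NULL);
	host = *hostp;
	if (host == NULL)
		return;

	if (host->ai != NULL)
		freeaddrinfo(host->ai);	/* host is still valid here */
	free(host->name);
	free(host);
	*hostp = NULL;
}
```

Clearing the caller's pointer is a design choice of this sketch; it makes a later accidental dereference fail loudly instead of reading freed memory.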
|
Added curly brackets around the record_pid() check which
stop sm-notify from exiting when a pid file does not
exist.
Signed-off-by: Steve Dickson <steved@redhat.com>
|
there are no hosts to notify. This also decreases
start up time by a few seconds.
Signed-off-by: Steve Dickson <steved@redhat.com>
|
Clean up.
The sm-notify command is built from a single source file.
Some of its internal functions are appropriately defined as static.
However, some are declared static, but defined as global. Some are
declared and defined as global. None of them are used outside of
utils/statd/sm-notify.c.
Make all the internal functions in utils/statd/sm-notify.c static.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Steve Dickson <steved@redhat.com>
|
Clean up: replace "typedef struct sockaddr_storage nsm_address" with
standard socket address types. This makes sm-notify.c consistent with other
parts of nfs-utils, and with typical network application coding conventions.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Steve Dickson <steved@redhat.com>
|
Clean up a few issues with logging in sm-notify.c.
Sometimes in sm-notify, when a system call fails the problem is reported
to stderr but not logged, and then usually sm-notify exits. In cases like
this, there are probably more hosts to notify, but sm-notify dies silently.
Make sure these errors are logged, and that the log messages explain the
nature of the problem.
Also, if sm-notify exits prematurely, make sure this is always reported at
the LOG_ERR level, not at the LOG_WARNING level.
Remove a couple of unnecessary '\n' in the arguments of nsm_log() calls --
nsm_log() already appends an '\n' to the message.
Finally, use exit() consistently in main().
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Steve Dickson <steved@redhat.com>
|
Make sure the results of getaddrinfo(3) are properly freed in notify().
Note this is a one-time addrinfo allocation that would be automatically
freed when sm-notify exits anyway, so this is more of a nit than a real
bug fix.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Steve Dickson <steved@redhat.com>
|
Clean up: Include config.h as other source files do; instead of using
"config.h" use the HAVE_CONFIG_H macro and include <config.h>.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Steve Dickson <steved@redhat.com>
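For reference, the usual autoconf idiom that this message refers to looks like the fragment below. It is the generic pattern, shown here as an illustration and not copied from the patch.

```c
/* Pull in the generated configuration only when the build defines it. */
#ifdef HAVE_CONFIG_H
#include <config.h>
#endif
```

Using angle brackets lets the compiler find config.h through the build's include path, which matters when building outside the source directory.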
|
If an NFS server has no network connectivity when it reboots,
it will block in sm-notify waiting for DNS lookup for a potentially
large number of hosts. This is not helpful and just annoys the
sysadmin.
So do the DNS lookup in the backgrounded phase of sm-notify,
before sending off the NOTIFY requests.
Acked-by: NeilBrown <neilb@suse.de>
Signed-off-by: Steve Dickson <steved@redhat.com>
|
Ports < 1024 are a scarce resource and should not be used
carelessly. Technically they should not be used at all without
registration with IANA, but sometimes we need them despite that.
So: for the socket that RPC services listen on, don't use a <1024 port
by default. There is no need.
For sockets that we send messages on, that are long-lived, and that might
need to appear 'privileged', avoid using a number that is registered in
/etc/services if possible.
|
Both SM_STAT and SM_MON can return the state of an NSM, but it is
unclear which NSM they return the state of, so the value cannot be
used, and lockd doesn't use it.
Document this confusion, and give the current state to the kernel
via a sysctl if that sysctl is available (since about 2.6.19).
This should make it possible for the NFS server to detect a small
class of bad SM_NOTIFY packets and not flush locks in that case.
Signed-off-by: Neil Brown <neilb@suse.de>
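Handing the state to the kernel only when the sysctl exists might look like the sketch below. The path in NSM_STATE_SYSCTL is an assumption (this message does not name the sysctl), and publish_nsm_state() is a hypothetical helper name.

```c
#define _POSIX_C_SOURCE 200112L
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Assumed sysctl location; the message above does not name it. */
#define NSM_STATE_SYSCTL "/proc/sys/fs/nfs/nsm_local_state"

/*
 * Write @state to the file at @path. Returns 0 on success, or -1 when
 * the file is missing or cannot be written.
 */
int publish_nsm_state(const char *path, int state)
{
	char buf[20];
	ssize_t written;
	int len, fd;

	assert(path != NULL);
	len = snprintf(buf, sizeof(buf), "%d\n", state);
	if (len < 0 || (size_t)len >= sizeof(buf))
		return -1;

	/* No O_CREAT: a missing file means the kernel lacks the sysctl. */
	fd = open(path, O_WRONLY);
	if (fd < 0)
		return -1;
	written = write(fd, buf, (size_t)len);
	close(fd);
	return (written == (ssize_t)len) ? 0 : -1;
}
```

A caller would pass NSM_STATE_SYSCTL as the path and treat a -1 return as "older kernel, carry on", since the sysctl is optional.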
|
When sending an SM_NOTIFY to a multi-homed host, try all the addresses
in rotation. After 4 failures on one address, try the next.
Signed-off-by: Neil Brown <neilb@suse.de>
|
Make sure that sm-notify really runs only once per reboot.
Signed-off-by: Neil Brown <neilb@suse.de>
|
Add sm-notify to the compile/install scripts,
(and fix a compile warning).
|
If /var/lib/nfs/sm is owned by non-root, setuid to that uid
after opening sockets but before receiving answers.
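A rough sketch of that privilege drop, assuming the decision is based on a stat(2) of the state directory. The helper names owner_is_unprivileged() and drop_to_dir_owner() are hypothetical, and a fuller version would probably also clear supplementary groups.

```c
#define _POSIX_C_SOURCE 200112L
#include <assert.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Nonzero when the directory described by @st belongs to a non-root user. */
int owner_is_unprivileged(const struct stat *st)
{
	assert(st != NULL);
	return st->st_uid != 0;
}

/*
 * Switch to the owner of @dir if that owner is not root.
 * Returns 0 on success or when there is nothing to do, -1 on error.
 * Meant to run after the sockets are open and before replies are read.
 */
int drop_to_dir_owner(const char *dir)
{
	struct stat st;

	assert(dir != NULL);
	if (stat(dir, &st) == -1)
		return -1;
	if (!owner_is_unprivileged(&st))
		return 0;

	/* Group first: after setuid() we may lack permission to change it. */
	if (setgid(st.st_gid) == -1)
		return -1;
	if (setuid(st.st_uid) == -1)
		return -1;
	return 0;
}
```

Calling drop_to_dir_owner("/var/lib/nfs/sm") once the sockets are bound keeps any privileged ports already obtained while the rest of the run proceeds as the directory's owner.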
|
As "mount.nfs" can start statd, and as statd can start sm-notify,
the risk of sm-notify being run multiple times increases.
As this is not normally appropriate, sm-notify now creates a
file in /var/run which will stop future instances from being
run (though of course this behaviour can be controlled by a
new command line option).
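One way to get this run-once behaviour is an exclusive create of the marker file, sketched below. The helper name record_pid_at() and its path parameter are hypothetical; a caller would pass a path under /var/run and skip the check when the override option is given.

```c
#define _POSIX_C_SOURCE 200112L
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Try to create the marker file at @path and store our pid in it.
 * Returns 1 if this call created the file and wrote the pid, 0 if the
 * file already existed or could not be written.
 */
int record_pid_at(const char *path)
{
	char buf[20];
	ssize_t written;
	int len, fd;

	assert(path != NULL);
	len = snprintf(buf, sizeof(buf), "%ld\n", (long)getpid());
	if (len < 0 || (size_t)len >= sizeof(buf))
		return 0;

	/* O_EXCL makes creation fail if the file already exists. */
	fd = open(path, O_CREAT | O_EXCL | O_WRONLY, 0600);
	if (fd < 0)
		return 0;
	written = write(fd, buf, (size_t)len);
	close(fd);
	return written == (ssize_t)len;
}
```

Because O_CREAT with O_EXCL is atomic, two instances racing at startup cannot both see a return value of 1.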
|
This functionality is already present in getaddrinfo so it isn't
needed explicitly.
|
for compat with statd.
|
Not included in build yet.
|