author | Chuck Lever <chuck.lever@oracle.com> | 2009-05-18 11:08:53 -0400 |
---|---|---|
committer | Steve Dickson <steved@redhat.com> | 2009-05-18 11:08:53 -0400 |
commit | 5d253e3e326bfcf0e8a342bca53f1b4db120a7a9 (patch) | |
tree | 54dba165f400d6ee912710bc5cb44affb6928eb0 /tools | |
parent | 3ab7ab5db0f825fdd95d017cdd6d6ee5d207dbe8 (diff) | |

sm-notify: Failed DNS lookups should be retried

Currently, if getaddrinfo(3) fails when trying to resolve a hostname,
sm-notify gives up immediately on that host. If sm-notify is started
before network service is available on a system, that means it quits
without notifying anyone. Or, if DNS service isn't available due to
a network partition or because the DNS server crashed, sm-notify will
simply remove all of its callback files and exit.

Really, sm-notify should try harder. We know that the hostnames
passed in to notify_host() have already been vetted by statd, which
won't monitor a hostname that it can't resolve. So it's likely that
any DNS failure we meet here is a temporary condition. If it isn't,
then sm-notify will stop trying to notify that host in 15 minutes
anyway.

[ The host's file is left in /var/lib/nfs/sm.bak in this case, but
sm.bak is not read again until the next time sm-notify runs. ]

sm-notify already has retry logic for handling RPC timeouts. We can
co-opt that to drive DNS resolution retries.
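
To illustrate the idea, here is a minimal sketch of treating a failed lookup like an RPC timeout: the host is rescheduled with a growing delay until the 15-minute window closes. It is not the code from sm-notify.c. The names `host_state`, `after_lookup`, and `next_step`, and the 2-second and 120-second backoff bounds, are assumptions made up for this example.

```c
#include <stdbool.h>
#include <time.h>

/* Give-up window taken from the commit text (15 minutes). */
#define GIVE_UP_SECS      (15 * 60)
/* Hypothetical backoff bounds, chosen only for illustration. */
#define FIRST_DELAY_SECS  2
#define MAX_DELAY_SECS    120

/* Hypothetical per-host state; not the structure used in sm-notify.c. */
struct host_state {
    const char *name;       /* hostname already vetted by statd */
    time_t      first_try;  /* when notification of this host began */
    time_t      next_try;   /* earliest time of the next attempt */
    unsigned    delay;      /* current retry delay in seconds; 0 = none yet */
};

enum next_step { HOST_READY, HOST_RETRY_LATER, HOST_GIVE_UP };

/*
 * Decide what to do with a host after one DNS lookup attempt.
 * A failed lookup is handled like an RPC timeout: back off and retry,
 * unless the overall give-up window has already elapsed.
 */
enum next_step after_lookup(struct host_state *h, bool lookup_ok, time_t now)
{
    if (now - h->first_try >= GIVE_UP_SECS)
        return HOST_GIVE_UP;

    if (lookup_ok)
        return HOST_READY;

    /* Double the delay on each failure, up to the cap. */
    if (h->delay == 0)
        h->delay = FIRST_DELAY_SECS;
    else if (h->delay * 2 > MAX_DELAY_SECS)
        h->delay = MAX_DELAY_SECS;
    else
        h->delay *= 2;

    h->next_try = now + h->delay;
    return HOST_RETRY_LATER;
}
```

In this sketch the caller would leave a host that returned `HOST_RETRY_LATER` on its pending list and look it up again once `next_try` has passed, so a resolver failure no longer ends notification for that host.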

We also add AI_ADDRCONFIG because on systems whose network startup is
handled by NetworkManager, there appears to be a bug that causes
processes that started calling getaddrinfo(3) before the network came
up to continue getting EAI_AGAIN even after the network is fully
operating.
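
For reference, a lookup that sets AI_ADDRCONFIG might look roughly like the sketch below. The helper name `try_resolve`, the choice of SOCK_DGRAM, and the `#ifdef` guard for headers that lack the flag are assumptions for illustration, not necessarily what the patch does.

```c
#define _POSIX_C_SOURCE 200112L

#include <netdb.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/*
 * Hypothetical helper: resolve a hostname with AI_ADDRCONFIG set.
 * Returns the address list on success, or NULL on failure with the
 * getaddrinfo(3) error code stored in *gai_err, so the caller can log
 * it with gai_strerror(3) and schedule a retry.
 */
struct addrinfo *try_resolve(const char *hostname, int *gai_err)
{
    struct addrinfo hints;
    struct addrinfo *result = NULL;

    memset(&hints, 0, sizeof(hints));
    hints.ai_family = AF_UNSPEC;     /* accept IPv4 or IPv6 */
    hints.ai_socktype = SOCK_DGRAM;  /* assumption: one entry per address */
#ifdef AI_ADDRCONFIG
    /* Only return address families configured on this system. */
    hints.ai_flags = AI_ADDRCONFIG;
#endif

    *gai_err = getaddrinfo(hostname, NULL, &hints, &result);
    if (*gai_err != 0)
        return NULL;                 /* caller decides whether to retry */

    return result;                   /* caller must call freeaddrinfo() */
}
```

A caller would treat a NULL return as "try again later" rather than as a permanent failure, and would release a successful result with freeaddrinfo(3).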

As I understand it, legacy glibc (before AI_ADDRCONFIG was exposed in
headers) sets AI_ADDRCONFIG by default, although I haven't checked
this. In any event, pre-glibc-2.2 systems probably won't run
NetworkManager anyway, so this may not be much of a problem for them.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Steve Dickson <steved@redhat.com>