| Commit message | Author | Age | Files | Lines |
|
|
|
|
|
|
|
| |
delete_all() really needed renaming for clarity. While doing this,
might as well rename some of the others that don't start with
"natgw_".
Signed-off-by: Martin Schwenke <martin@meltin.net>
|
|
|
|
|
|
|
|
|
|
| |
NAT gateway really can't operate unless most of the configuration
variables are set.
A check in delete_all() can be removed - strange that this isn't also
done in the add case.
Signed-off-by: Martin Schwenke <martin@meltin.net>
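The requirement above, that the NAT gateway cannot operate unless most of its configuration variables are set, could be enforced with an up-front check along the following lines. This is only a sketch: the function name natgw_check_config and the particular list of variables are assumptions for illustration, since the commit message does not say which variables are checked.

```shell
# Hypothetical helper: succeed only if every listed variable is set and
# non-empty.  The list of variables is an assumption, not from the commit.
natgw_check_config ()
{
    for _var in CTDB_NATGW_PUBLIC_IP \
                CTDB_NATGW_PUBLIC_IFACE \
                CTDB_NATGW_DEFAULT_GATEWAY \
                CTDB_NATGW_PRIVATE_NETWORK ; do
        # Indirect lookup of the variable named by $_var (POSIX sh)
        eval "_val=\"\${${_var}:-}\""
        if [ -z "$_val" ] ; then
            echo "ERROR: ${_var} is not set" >&2
            return 1
        fi
    done
    return 0
}
```

Doing the check once, before any add or delete action, is what would make a separate check inside the delete path redundant.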
|
|
|
|
|
|
| |
Put the code into a couple of usefully named functions.
Signed-off-by: Martin Schwenke <martin@meltin.net>
|
|
|
|
|
|
| |
In case we want to write some unit tests in the future.
Signed-off-by: Martin Schwenke <martin@meltin.net>
|
|
|
|
|
|
|
|
| |
Commit 176ae6c704528c021fcc34a41878584f43a00119 caused these functions
to exit on failure. This is incorrect and broke NAT gateway.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
"statd-callout notify" currently complains until an add-client or
del-client is done.
Given that we might use ctdb.tdb for something else in the future it
makes sense to attach to it in the "startup" event. This could be done
in the background but it should be so lightweight that a timeout will
indicate serious problems.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This feature was added quite a while ago but was not enabled by
default. It is a useful feature so enable it to dump stack traces of
up to 5 stuck processes by default.
This can be disabled by setting:
CTDB_NFS_DUMP_STUCK_THREADS=0
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
Autobuild-Date(master): Tue Feb 25 04:06:45 CET 2014 on sn-devel-104
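The default described above (dump up to 5 stuck processes unless CTDB_NFS_DUMP_STUCK_THREADS is set to 0) can be sketched with ordinary shell parameter defaults. The two function names are hypothetical and the real script may structure this differently.

```shell
# Hypothetical helper: print the effective limit, falling back to 5 when
# the variable is unset or empty.
nfs_dump_stuck_threads_limit ()
{
    echo "${CTDB_NFS_DUMP_STUCK_THREADS:-5}"
}

# Hypothetical helper: succeed if stack dumping is enabled at all.
nfs_dump_stuck_threads_enabled ()
{
    [ "$(nfs_dump_stuck_threads_limit)" -gt 0 ]
}
```

With this shape, setting CTDB_NFS_DUMP_STUCK_THREADS=0 in the configuration disables the feature without any further special casing.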
|
|
|
|
|
|
|
|
|
|
|
|
| |
This comment was true when 50.samba was spaghetti because it tried to
automatically manage both smbd (and nmbd) and winbind. It isn't true
anymore.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
Autobuild-Date(master): Wed Feb 19 04:07:12 CET 2014 on sn-devel-104
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Add stack dumps for "interesting" processes. These processes sometimes
get stuck, so try to print stack traces for them if they appear in the
pstree output.
* Add new configuration variables CTDB_DEBUG_HUNG_SCRIPT_LOGFILE and
CTDB_DEBUG_HUNG_SCRIPT_STACKPAT. These are primarily for testing
but the latter may be useful for live debugging.
* Load CTDB configuration so that above configuration variables can be
set/changed without restarting ctdbd.
Add a test that tries to ensure that all of this is working.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
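To illustrate the first bullet above, the sketch below picks the PIDs of "interesting" processes out of pstree output. It assumes the "name(pid)" format produced by "pstree -p" and takes the pattern as an extended regular expression; the function name and the example pattern are hypothetical, and the actual script may parse a different pstree format.

```shell
# Hypothetical helper: read "pstree -p" style text on stdin and print the
# PID of every process whose name matches the extended regexp in $1.
interesting_pids ()
{
    _pat="$1"
    # Keep only the "name(pid)" fragments whose name matches, then strip
    # everything except the PID.
    grep -E -o "(${_pat})\([0-9]+\)" | sed -e 's/.*(\([0-9]*\))$/\1/'
}
```

A caller could then loop over the printed PIDs and attempt to read a stack trace for each, for example with a pattern taken from CTDB_DEBUG_HUNG_SCRIPT_STACKPAT.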
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If a primary IP address is being deleted from an interface, the
secondaries are remembered and added back after the primary is
deleted. This is done under a lock shared by the add/del script code.
It is necessary because, by default, Linux deletes secondaries when
the corresponding primary is deleted.
There is a race here between ctdbd and the scripts, since ctdbd
doesn't know about the lock. If ctdbd receives a release IP control
and the IP address is not on an interface then it is regarded as a
"Redundant release of IP" so no "releaseip" event is generated. This
can occur if the IP address in question is a secondary that has been
temporarily dropped. It is more likely if the number of secondaries
is large.
Since Linux 2.6.12 (i.e. 2005) Linux has supported a
promote_secondaries option on interfaces. This option is currently
undocumented but that will change in Linux 3.14. With
promote_secondaries enabled the kernel will not drop secondaries but
will promote a corresponding secondary instead. The kernel does all
necessary locking.
Use promote_secondaries to simplify the code, avoid re-adding
secondaries, avoid re-adding routes and provide improved performance.
This could be done conditionally, with a fallback to legacy
secondary-re-adding code, but no supported Linux distribution is
running a pre-2.6.12 kernel so this is unnecessary.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
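On Linux, promote_secondaries is a per-interface setting exposed under /proc/sys/net/ipv4/conf/. A minimal sketch of enabling it for one interface follows. The function name and the optional second argument (an alternative base directory, useful for testing without root) are assumptions for illustration and are not taken from the commit.

```shell
# Hypothetical helper: enable promote_secondaries on interface $1.
# $2 optionally overrides the sysctl directory (defaults to the real one).
enable_promote_secondaries ()
{
    _iface="$1"
    _conf_dir="${2:-/proc/sys/net/ipv4/conf}"
    _knob="${_conf_dir}/${_iface}/promote_secondaries"

    if [ ! -e "$_knob" ] ; then
        echo "promote_secondaries not available for ${_iface}" >&2
        return 1
    fi
    # Writing 1 asks the kernel to promote a secondary address instead of
    # dropping all secondaries when the primary is deleted.
    echo 1 >"$_knob"
}
```

Once the setting is enabled, deleting a primary address no longer requires the script to remember and re-add the secondaries, which is what removes the race described above.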
|
|
|
|
|
|
|
|
|
|
| |
This adds new files for Ganesha's recovery. The myreleaseip_* files are
used by the recovery thread on the node where the IP is released. The
releaseip_* and takeip_* files are used by the recovery thread on the
node where the IP is taken over.
Signed-off-by: Srikrishan Malik <srimalik@in.ibm.com>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
|
|
|
|
|
|
| |
Signed-off-by: Srikrishan Malik <srimalik@in.ibm.com>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
|
|
|
|
|
|
|
|
|
| |
Signed-off-by: Srikrishan Malik <srimalik@in.ibm.com>
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
Autobuild-User(master): Martin Schwenke <martins@samba.org>
Autobuild-Date(master): Thu Jan 30 11:18:19 CET 2014 on sn-devel-104
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Services can be flagged for reconfigure when they release IPs at
shutdown. The flag is never removed and the service is prematurely
reconfigured during the first "ipreallocated" event, before any IPs
are hosted and before the "startup" event has actually started the
services.
$CTDB_VARDIR/state directly contained the service state subdirectories
and is already removed in the "init" event. Just push the service
state subdirectories down a level and put everything else in a
subdirectory.
This way all the eventscript state gets cleaned up every time CTDB
starts up.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
Autobuild-Date(master): Fri Jan 17 09:58:26 CET 2014 on sn-devel-104
|
|
|
|
|
|
|
| |
Also update the related test.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently the lock is held until the corresponding eventscript
completes, since the process still exists. If the regular part of an
eventscript hangs then the lock might unnecessarily be held for a long
time. The pathological case is when a monitor event gets stuck in
D-wait state and the script times out but can't be killed so the lock
is still held. This can cause an unwanted monitor replay.
Change this so that the lock is released immediately after the
reconfiguration is complete.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
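The change above can be pictured as scoping the lock to the reconfiguration step only, rather than to the lifetime of the script. The sketch below assumes a lock implemented as a directory created with mkdir; the commit message does not say how the lock is implemented, and the function name is hypothetical.

```shell
# Hypothetical helper: run the command in "$@" while holding the lock
# directory $1, and release the lock as soon as the command finishes.
reconfigure_with_lock ()
{
    _lockdir="$1" ; shift

    # mkdir is atomic: it fails if another process already holds the lock
    mkdir "$_lockdir" 2>/dev/null || return 1

    "$@"
    _status=$?

    # Release immediately, before the rest of the eventscript runs
    rmdir "$_lockdir"
    return $_status
}
```

Because the lock is dropped before the regular part of the eventscript runs, a script that later hangs cannot keep the lock held.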
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
"monitor" events can be cancelled. If a reconfigure action does a
service restart then the "monitor" event can be cancelled at the
inconvenient moment after the service is stopped. In this case the
service stays down and the node may become unhealthy (depending on
whether there are any repair actions in the monitor event).
A long time ago we did service reconfiguration in "monitor" events
following failovers. Service reconfiguration was then moved to the
"ipreallocated" event. However, reconfiguration in "monitor" events
has been kept as a last resort in case an "ipreallocated" event does
not occur. The only important case that this covers is "ctdb
deleteip", where "releaseip" events are generated without a
corresponding "ipreallocated". Therefore, IPs can be deleted without
running the required service reconfiguration.
The supported way of removing IP addresses is now via "ctdb
reloadips", which always causes a takeover run with a corresponding
"ipreallocated" event.
This means that service reconfiguration in "monitor" events is no
longer required and should be removed because it is unsafe.
Also update the associated tests. Make the first confirm that the
monitor event no longer does reconfiguration. Change the others to
test that monitor status is correctly replayed when something else is
doing a reconfigure and currently holds the reconfigure lock.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
Autobuild-Date(master): Tue Dec 17 06:32:35 CET 2013 on sn-devel-104
|
|
|
|
|
|
|
| |
sort expects the data to be line based, so make it so.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Michael Adam <obnox@samba.org>
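As a small illustration of the point above: sort compares whole lines, so a whitespace-separated list has to be split into one item per line before sorting. The commit does not show the data involved, so the list format and the function name here are assumptions.

```shell
# Hypothetical helper: sort the whitespace-separated words given as
# arguments, printing one word per line.
sort_words ()
{
    # tr turns each run of spaces into a single newline so that sort
    # sees one item per line
    echo "$*" | tr -s ' ' '\n' | sort
}
```

Without the tr step, sort would receive a single line and return it unchanged.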
|
|
|
|
|
|
| |
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Pair-programmed-with: Martin Schwenke <martin@meltin.net>
Reviewed-by: Michael Adam <obnox@samba.org>
|
|
|
|
|
| |
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Michael Adam <obnox@samba.org>
|
|
|
|
|
|
|
|
|
| |
If these configuration variables are not defined, then there should be
a default fallback. This is a workaround until CTDB compile-time
configuration can be accessed at runtime.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Michael Adam <obnox@samba.org>
|
|
|
|
|
|
|
|
| |
If the NFS RPC checks restart Ganesha then the share check can fail
prematurely.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Michael Adam <obnox@samba.org>
|
|
|
|
|
|
|
|
| |
If $statd_state is empty then the loop will run once and print
spurious errors.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Michael Adam <obnox@samba.org>
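The problem described above is typical of loops fed by echo: echoing an empty variable still produces one (empty) line, so a "while read" loop runs once. The sketch below assumes the loop has that shape, which the commit message does not confirm, and uses a hypothetical function name.

```shell
# Hypothetical helper: process each line of the state passed in $1,
# doing nothing at all when the state is empty.
process_statd_state ()
{
    _state="$1"

    # Guard: without this, echo "" would feed one empty line to the loop
    [ -n "$_state" ] || return 0

    echo "$_state" | while read -r _line ; do
        echo "processing: ${_line}"
    done
}
```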
|
|
|
|
|
| |
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Michael Adam <obnox@samba.org>
|
|
|
|
|
|
|
|
|
|
| |
This prevents spamming of logs if multiple lock requests are waiting
and keep timing out.
Also, improve the logging format with separators.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Michael Adam <obnox@samba.org>
|
|
|
|
|
|
|
|
|
| |
This is naive and assumes no performance problems when updating
persistent DBs. It also does no error handling.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Michael Adam <obnox@samba.org>
|
|
|
|
|
|
|
|
| |
That is, don't use fixed paths.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Michael Adam <obnox@samba.org>
|
|
|
|
|
|
|
|
| |
The target path is missing.
Signed-off-by: Srikrishan Malik <srimalik@in.ibm.com>
(This used to be ctdb commit 370022e1ff654db99d0c3ce0c49914c249e57289)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The background update is never guaranteed to complete before the cache
is used, so don't bother trying it at the beginning. Instead, put a
timeout on a foreground update.
If the foreground update fails:
* If there's no available cache file then die.
* If there is a previous cache file then use it and log a warning.
* Do a background update at the end of the monitor event.
Also remove commas in the "smb ports" list before use, since (newer?)
testparm seems to insert commas into the default value. Update the
associated test to add a comma.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>
(This used to be ctdb commit 8c6f511254ecb0381a609b37e3a0ee6e5ec5d562)
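The fallback rules listed above can be sketched as follows. This assumes the coreutils "timeout" command is available; the function names, the ".new" temporary file and the argument order are all hypothetical and are not taken from the actual script.

```shell
# Hypothetical helper: refresh cache file $1 by running the command in
# "$@" (from the third argument on) with a time limit of $2 seconds.
update_cache ()
{
    _cache="$1" ; _limit="$2" ; shift 2
    _new="${_cache}.new"

    # Foreground update with a timeout; only replace the cache on success
    if timeout "$_limit" "$@" >"$_new" 2>/dev/null ; then
        mv "$_new" "$_cache"
        return 0
    fi
    rm -f "$_new"

    if [ -r "$_cache" ] ; then
        # A previous cache exists: use it, but say so
        echo "WARNING: cache update failed, using previous cache" >&2
        return 0
    fi

    # No cache at all: the caller should give up
    echo "ERROR: cache update failed and no previous cache exists" >&2
    return 1
}

# Hypothetical helper: replace commas in a port list with spaces so the
# list can be word-split, e.g. "445, 139" yields the words 445 and 139.
smb_ports_normalise ()
{
    echo "$1" | tr ',' ' '
}
```

A background update at the end of the monitor event, as described in the last bullet, would then give the next foreground attempt a fresher cache to fall back on.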
|
|
|
|
|
|
|
|
|
|
| |
Elsewhere we're moving the socket to /var/run/ctdb. We might end up
with PID files and sockets for other daemons later, so let's call the
directory "ctdb" instead of "ctdbd".
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit b63f6fd2d295c8e18cbf3420ab05fce07b727f31)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Use /var/run/ctdb/ctdbd.socket because there might be other daemons
that need sockets in the future.
The local daemons test code that creates a link for the default
convenience socket has to be removed because the link can't be created
as a regular user in the new location. This should be OK since all
calls to the ctdb tool in the test code should be wrapped in onnode.
When debugging tests, a developer will have to set CTDB_SOCKET by
hand.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Pair-programmed-with: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit dc67a4e24af9d07aead2a1710eeaf5d6cc409201)
|
|
|
|
| |
(This used to be ctdb commit a0b965bb73777dde7a4abf80c5c4742581bce520)
|
|
|
|
|
|
| |
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
(This used to be ctdb commit 58ca2c3e7e3a27023ad86660f01a2052e2a19635)
|
|
|
|
|
|
| |
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 30d9b634b16c3cc740e5e453ea5c21012b1fde88)
|
|
|
|
|
|
| |
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 516cdea0e73cf3f63b3303e22809834c8cbc64e4)
|
|
|
|
|
|
|
|
|
|
| |
* It should run on "ipreallocated" instead of "recovered"
* Variable name NODE -> ip since that's what it is
* Simplify some logic
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 45e2bc66abf9fcfeadcc279a656ed7fd1838920a)
|
|
|
|
|
|
|
|
|
|
| |
Routes only need to be updated when IPs have moved. IP takeover runs
will generate "ipreallocated", which is enough. "recovered" always
follows "ipreallocated" anyway, so avoid the redundancy.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 1152215fc69217e4292762e28d193b7ea0e06ee3)
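In eventscript terms, the change above amounts to handling "ipreallocated" and letting "recovered" fall through to the default case. A minimal sketch follows; the function name and the echoed message are hypothetical stand-ins for the real route update.

```shell
# Hypothetical dispatcher: $1 is the event name passed to the eventscript.
handle_event ()
{
    case "$1" in
    ipreallocated)
        # IPs may have moved, so this is the one place routes are updated
        echo "updating routes"
        ;;
    *)
        # "recovered" and all other events: nothing to do
        :
        ;;
    esac
}
```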
|
|
|
|
|
|
|
|
|
|
|
| |
Any time a node changes flags in any significant way there will be a
takeover run, which will generate an "ipreallocated" event. The
"recovered" event always happens straight after a takeover run so we
update the NAT gateway twice.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 542c70d6281d636ecd51502fbbf219f418bfac66)
|
|
|
|
|
|
| |
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 00736a21fc268c10b6a718731e56b3dbb7e60554)
|
|
|
|
|
|
| |
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 2ea9d3acfe7e8665685f54294f5edc9b8ffc2f3f)
|
|
|
|
|
|
|
|
|
| |
There is no reconfigure code for these scripts so no need to check for
reconfiguration.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 41df1637c1d8a7b2f5a9974408db71b1f74cb2f2)
|
|
|
|
|
|
|
|
|
| |
Nothing ever (or has ever) set the "needs reconfigure" flag, so this
code is unnecessary.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 5b77fd95bda5f1960aca952e1b759231890b56f3)
|
|
|
|
|
|
|
|
|
| |
A generic framework is no longer needed now that the "ctdb" checker is
the only one left. Simplify the code.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 044d302b41a2040642355401e3236fcecc3a620a)
|
|
|
|
|
|
|
|
|
|
|
| |
"ctdb checktcpport" is no longer experimental so the other checkers
are no longer required.
Remove tests related to the removed checkers.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 50e330d0679614bee2e7bab028436e929f74ca50)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The current setting is inconsistent with settings on most systems,
putting /bin before /sbin. Use of /usr/local/bin, which may be
required on some systems, is also overridden. This can make it
difficult to do interactive debugging of script problems.
Rely on the system PATH instead.
If system-specific changes need to be made then this can be done in a
configuration file.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit cfbff39e22e42f3997f637290748290833525714)
|
|
|
|
|
|
|
|
| |
Reduce the complexity, including the depth of background processes.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 49f077c475b078889ff0492fe7d567a64d6cb87c)
|
|
|
|
|
|
|
|
|
|
| |
Otherwise calls to "ctdb natgwlist" will not behave as expected if a
non-standard file is used, since that command will use the default
file location.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit e574b30257126679704b088c4334a8e7a53a9c3f)
|
|
|
|
|
|
|
|
|
| |
The old logic was actually wrong. If CTDB_LOGFILE is unset then a
default is used, not syslog.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 79e2029f9bc078126e865aa715100a3870c7604b)
|
|
|
|
|
|
|
|
|
|
|
| |
Allowing people to put random options in CTDB_OPTIONS complicates some
logic (particularly around use of syslog). If we're going to have
variables for options then let's make sure we have a variable for each
option and make people use them.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit e55f3a1577eff0182802b0341d865d961aeae1c7)
|
|
|
|
|
|
| |
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit bda0da41aaf629a252cc361b73ebc5328f26ed04)
|