path: root/ctdb/config
Commit message | Author | Age | Files | Lines
* ctdb-eventscripts: New configuration variable CTDB_GANESHA_REC_SUBDIR | Martin Schwenke | 2014-06-11 | 1 | -3/+5
  Backup and restore of the cluster filesystem can upset the operation of 60.ganesha by changing the contents of this subdirectory. Allow this subdirectory to be configured to a subdirectory that is ignored by backup and restore processes.

  Signed-off-by: Martin Schwenke <martin@meltin.net>
  Reviewed-by: Amitay Isaacs <amitay@gmail.com>
  Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
  Autobuild-Date(master): Wed Jun 11 09:29:22 CEST 2014 on sn-devel-104
* ctdb-eventscripts: Add check for invalid policy routing configuration | Martin Schwenke | 2014-05-05 | 1 | -0/+5
  The range CTDB_PER_IP_ROUTING_TABLE_ID_LOW..CTDB_PER_IP_ROUTING_TABLE_ID_HIGH should not include 253-255. Otherwise policy routing may overwrite the default system routing tables.

  Add some corresponding tests.

  Signed-off-by: Martin Schwenke <martin@meltin.net>
  Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>
  Reviewed-by: Amitay Isaacs <amitay@gmail.com>
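The range check described above can be sketched roughly as follows. This is an illustration only, not the code from 11.routing: the function name check_table_id_range is hypothetical, and it assumes the low value does not exceed the high value. IDs 253, 254 and 255 are the default, main and local tables on Linux.

```shell
# Hypothetical helper (not the actual eventscript code).
# Fails if the range LOW..HIGH overlaps the reserved table IDs 253-255.
# Assumes LOW <= HIGH.
check_table_id_range ()
{
    _low="$1"
    _high="$2"

    # Two ranges overlap when each one starts before the other ends.
    if [ "$_low" -le 255 ] && [ "$_high" -ge 253 ] ; then
        echo "table ID range ${_low}..${_high} must not include 253-255" >&2
        return 1
    fi
    return 0
}
```

For example, `check_table_id_range 1000 9000` succeeds, while `check_table_id_range 250 254` prints the message and fails.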
* ctdb-eventscripts: Update comment in 11.routing | Martin Schwenke | 2014-05-05 | 1 | -8/+13
  Signed-off-by: Martin Schwenke <martin@meltin.net>
  Reviewed-by: Amitay Isaacs <amitay@gmail.com>
* ctdb-eventscripts: Don't check if $iface is empty | Martin Schwenke | 2014-05-05 | 1 | -13/+15
  This is the loop variable. It can't be empty, especially given the way the list is built. This must have survived from an earlier version of the script.

  Given that there are whitespace changes associated with the above, clean-up the "virtio_net" avoidance check so that it reads less like line-noise.

  Signed-off-by: Martin Schwenke <martin@meltin.net>
  Reviewed-by: Amitay Isaacs <amitay@gmail.com>
* ctdb-eventscripts: CTDB_NATGW_PUBLIC_* optional on slave-only nodes | Martin Schwenke | 2014-04-14 | 1 | -4/+6
  Commit 4ee4925d416a86341bd76c11fa99ec9173682a1d forgot about CTDB_NATGW_SLAVE_ONLY so it introduces an incorrect failure when this is set, and CTDB_NATGW_PUBLIC_IFACE or CTDB_NATGW_PUBLIC_IP is unset.

  Relax the sanity check to see if CTDB_NATGW_SLAVE_ONLY is set.

  Update the documentation to explicitly state that CTDB_NATGW_PUBLIC_IFACE and CTDB_NATGW_PUBLIC_IP are optional and unused if CTDB_NATGW_SLAVE_ONLY is set.

  It would be possible to insist that CTDB_NATGW_PUBLIC_IFACE and CTDB_NATGW_PUBLIC_IP should be unset in that case. However, it is more reasonable to allow consistent configuration across nodes except with some nodes configured slave-only.

  Add tests, update infrastructure and fix a thinko in the stub's "natgwlist" implementation.

  Signed-off-by: Martin Schwenke <martin@meltin.net>
  Reviewed-by: Amitay Isaacs <amitay@gmail.com>
  Autobuild-User(master): Martin Schwenke <martins@samba.org>
  Autobuild-Date(master): Mon Apr 14 06:06:49 CEST 2014 on sn-devel-104
* ctdb-eventscripts: CTDB_NATGW_STATIC_ROUTES can specify gateways | Martin Schwenke | 2014-03-26 | 1 | -7/+15
  Extend CTDB_NATGW_STATIC_ROUTES so that each network can have an optional gateway that overrides CTDB_NATGW_DEFAULT_GATEWAY.

  Signed-off-by: Martin Schwenke <martin@meltin.net>
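A rough sketch of how a per-network gateway override could be resolved. The commit message does not spell out the syntax, so the NETWORK[@GATEWAY] entry format used here is an assumption, and both function names are hypothetical rather than taken from the NAT gateway eventscript.

```shell
# Assumption: each entry looks like NETWORK or NETWORK@GATEWAY.
# Both helpers are hypothetical illustrations.

# Print the network part of an entry.
natgw_route_network ()
{
    echo "${1%@*}"
}

# Print the gateway for an entry, falling back to the default gateway
# (second argument) when the entry carries no "@GATEWAY" suffix.
natgw_route_gateway ()
{
    _route="$1"
    _default_gw="$2"

    case "$_route" in
    *@*) echo "${_route#*@}" ;;
    *)   echo "$_default_gw" ;;
    esac
}
```

With a default gateway of 10.0.0.1, the entry `10.1.1.0/24` would resolve to 10.0.0.1 and `10.1.2.0/24@10.0.0.254` to 10.0.0.254.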
* ctdb-eventscripts: New configuration variable CTDB_NATGW_STATIC_ROUTES | Martin Schwenke | 2014-03-26 | 1 | -3/+12
  This can be used to create more specific NATGW routes than the usual NATGW default route.

  Signed-off-by: Martin Schwenke <martin@meltin.net>
* ctdb-eventscripts: Clarify that CTDB_NATGW_DEFAULT_GATEWAY is optional | Martin Schwenke | 2014-03-26 | 1 | -1/+3
  This has been implied since the command to add the route has had errors redirected to /dev/null. If infrastructure (e.g. ADS, DNS) is on the same network as CTDB_NATGW_PUBLIC_IP then no route is necessary.

  Signed-off-by: Martin Schwenke <martin@meltin.net>
* ctdb-eventscripts: Improve check in NATGW "startup" event | Martin Schwenke | 2014-03-26 | 1 | -2/+5
  Although the dots in $CTDB_NATGW_PUBLIC_IP could probably only help match an invalid public IP address, this is only executed once so do as exact a check as possible.

  Use CTDB_BASE instead of hardcoding /etc/ctdb.

  Make the error message less redundant.

  Signed-off-by: Martin Schwenke <martin@meltin.net>
* ctdb-eventscripts: Reformat natgw_clear() | Martin Schwenke | 2014-03-26 | 1 | -9/+11
  Signed-off-by: Martin Schwenke <martin@meltin.net>
* ctdb-eventscripts: Rename some NAT gateway functions | Martin Schwenke | 2014-03-26 | 1 | -10/+11
  delete_all() really needed renaming for clarity. While doing this, might as well rename some of the others that don't start with "natgw_".

  Signed-off-by: Martin Schwenke <martin@meltin.net>
* ctdb-eventscripts: Sanity check NAT gateway configuration | Martin Schwenke | 2014-03-26 | 1 | -3/+20
  NAT gateway really can't operate unless most of the configuration variables are set.

  A check in delete_all() can be removed - strange that this isn't also done in the add case.

  Signed-off-by: Martin Schwenke <martin@meltin.net>
* ctdb-eventscripts: Improve readability of NAT gateway update code | Martin Schwenke | 2014-03-26 | 1 | -16/+31
  Put the code into a couple of usefully named functions.

  Signed-off-by: Martin Schwenke <martin@meltin.net>
* ctdb-eventscripts: Use set_proc() to update /proc | Martin Schwenke | 2014-03-26 | 1 | -3/+3
  In case we want to write some unit tests in the future.

  Signed-off-by: Martin Schwenke <martin@meltin.net>
* ctdb-eventscripts: Fix regression in IP add/delete functions | Martin Schwenke | 2014-03-23 | 1 | -4/+8
  Commit 176ae6c704528c021fcc34a41878584f43a00119 caused these functions to exit on failure. This is incorrect and broke NAT gateway.

  Signed-off-by: Martin Schwenke <martin@meltin.net>
  Reviewed-by: Amitay Isaacs <amitay@gmail.com>
* ctdb-eventscripts: Attach to persistent ctdb.tdb in "startup" event | Martin Schwenke | 2014-03-23 | 1 | -1/+2
  "statd-callout notify" currently complains until an add-client or del-client is done. Given that we might use ctdb.tdb for something else in the future it makes sense to attach to it in the "startup" event.

  This could be done in the background but it should be so lightweight that a timeout will indicate serious problems.

  Signed-off-by: Martin Schwenke <martin@meltin.net>
  Reviewed-by: Amitay Isaacs <amitay@gmail.com>
* ctdb-eventscripts: Switch on dumping of stuck nfsd threads | Martin Schwenke | 2014-02-25 | 1 | -1/+1
  This feature was added quite a while ago but was not enabled by default. It is a useful feature so enable it to dump stack traces of up to 5 stuck processes by default.

  This can be disabled by setting:

    CTDB_NFS_DUMP_STUCK_THREADS=0

  Signed-off-by: Martin Schwenke <martin@meltin.net>
  Reviewed-by: Amitay Isaacs <amitay@gmail.com>
  Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
  Autobuild-Date(master): Tue Feb 25 04:06:45 CET 2014 on sn-devel-104
* ctdb-scripts: Update a misleading comment | Martin Schwenke | 2014-02-19 | 1 | -8/+1
  This comment was true when 50.samba was spaghetti because it tried to automatically manage both smbd (and nmbd) and winbind. It isn't true anymore.

  Signed-off-by: Martin Schwenke <martin@meltin.net>
  Reviewed-by: Amitay Isaacs <amitay@gmail.com>
  Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
  Autobuild-Date(master): Wed Feb 19 04:07:12 CET 2014 on sn-devel-104
* ctdb-scripts: Enhancements to hung script debugging | Martin Schwenke | 2014-02-19 | 1 | -2/+32
  * Add stack dumps for "interesting" processes that sometimes get stuck, so try to print stack traces for them if they appear in the pstree output.
  * Add new configuration variables CTDB_DEBUG_HUNG_SCRIPT_LOGFILE and CTDB_DEBUG_HUNG_SCRIPT_STACKPAT. These are primarily for testing but the latter may be useful for live debugging.
  * Load CTDB configuration so that above configuration variables can be set/changed without restarting ctdbd.

  Add a test that tries to ensure that all of this is working.

  Signed-off-by: Martin Schwenke <martin@meltin.net>
  Reviewed-by: Amitay Isaacs <amitay@gmail.com>
* ctdb-eventscripts: Deleting IPs should use the promote_secondaries option | Martin Schwenke | 2014-02-13 | 2 | -71/+20
  If a primary IP address is being deleted from an interface, the secondaries are remembered and added back after the primary is deleted. This is done under a lock shared by the add/del script code. It is necessary because, by default, Linux deletes secondaries when the corresponding primary is deleted.

  There is a race here between ctdbd and the scripts, since ctdbd doesn't know about the lock. If ctdbd receives a release IP control and the IP address is not on an interface then it is regarded as a "Redundant release of IP" so no "releaseip" event is generated. This can occur if the IP address in question is a secondary that has been temporarily dropped. It is more likely if the number of secondaries is large.

  Since Linux 2.6.12 (i.e. 2005) Linux has supported a promote_secondaries option on interfaces. This option is currently undocumented but that will change in Linux 3.14. With promote_secondaries enabled the kernel will not drop secondaries but will promote a corresponding secondary instead. The kernel does all necessary locking.

  Use promote_secondaries to simplify the code, avoid re-adding secondaries, avoid re-adding routes and provide improved performance.

  This could be done conditionally, with a fallback to legacy secondary-re-adding code, but no supported Linux distribution is running a pre-2.6.12 kernel so this is unnecessary.

  Signed-off-by: Martin Schwenke <martin@meltin.net>
  Reviewed-by: Amitay Isaacs <amitay@gmail.com>
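As a minimal sketch of the mechanism described above, promote_secondaries is a per-interface setting under /proc/sys/net/ipv4/conf. The function name and the PROC_ROOT override below are hypothetical (PROC_ROOT exists only so the sketch can be exercised against a scratch directory); the real eventscript code may look quite different.

```shell
# Hypothetical helper: enable promote_secondaries on one interface.
# PROC_ROOT is an assumption added for testing; it defaults to /proc.
enable_promote_secondaries ()
{
    _iface="$1"
    _proc_root="${PROC_ROOT:-/proc}"

    # With this set to 1, deleting a primary address promotes a
    # secondary instead of dropping all secondaries with it.
    echo 1 >"${_proc_root}/sys/net/ipv4/conf/${_iface}/promote_secondaries"
}
```

Run as root, `enable_promote_secondaries eth0` would then be followed by an ordinary `ip addr del` of the primary address, with no need to re-add secondaries by hand.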
* ctdb-eventscripts: Create extra files for ganesha recovery | Srikrishan Malik | 2014-02-12 | 1 | -0/+2
  This adds new files for Ganesha's recovery. myreleaseip_* are used by the recovery thread on the node where IP is released. The releaseip_* and tekeip_* files are used by recovery thread where IP is taken over.

  Signed-off-by: Srikrishan Malik <srimalik@in.ibm.com>
  Reviewed-by: Amitay Isaacs <amitay@gmail.com>
  Reviewed-by: Martin Schwenke <martin@meltin.net>
* ctdb-eventscripts: Run mmlsconfig only once and use cached results | Srikrishan Malik | 2014-02-12 | 1 | -2/+20
  Signed-off-by: Srikrishan Malik <srimalik@in.ibm.com>
  Reviewed-by: Amitay Isaacs <amitay@gmail.com>
  Reviewed-by: Martin Schwenke <martin@meltin.net>
* ctdb-eventscripts: Do not mark node unhealthy if no fs is available | Srikrishan Malik | 2014-01-30 | 1 | -3/+4
  Signed-off-by: Srikrishan Malik <srimalik@in.ibm.com>
  Signed-off-by: Amitay Isaacs <amitay@gmail.com>
  Reviewed-by: Martin Schwenke <martin@meltin.net>
  Autobuild-User(master): Martin Schwenke <martins@samba.org>
  Autobuild-Date(master): Thu Jan 30 11:18:19 CET 2014 on sn-devel-104
* ctdb/eventscripts: Move all eventscript state under $CTDB_VARDIR/state | Martin Schwenke | 2014-01-17 | 1 | -4/+4
  Services can be flagged for reconfigure when they release IPs at shutdown. The flag is never removed and the service is prematurely reconfigured during the first "ipreallocated" event, before any IPs are hosted and before the "startup" event has actually started the services.

  $CTDB_VARDIR/state directly contained the service state subdirectories and is already removed in the "init" event. Just push the service state subdirectories down a level and put everything else in a subdirectory. This way all the eventscript state gets cleaned up every time CTDB starts up.

  Signed-off-by: Martin Schwenke <martin@meltin.net>
  Reviewed-by: Amitay Isaacs <amitay@gmail.com>
  Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
  Autobuild-Date(master): Fri Jan 17 09:58:26 CET 2014 on sn-devel-104
* ctdb/eventscripts: Print a count if killing TCP connections times out | Martin Schwenke | 2014-01-17 | 1 | -1/+1
  Also update related test

  Signed-off-by: Martin Schwenke <martin@meltin.net>
  Reviewed-by: Amitay Isaacs <amitay@gmail.com>
* ctdb/eventscripts: Reconfigure lock should be released quickly | Martin Schwenke | 2014-01-17 | 1 | -2/+12
  Currently the lock is held until the corresponding eventscript completes, since the process still exists. If the regular part of an eventscript hangs then the lock might unnecessarily be held for a long time. The pathological case is when a monitor event gets stuck in D-wait state and the script times out but can't be killed so the lock is still held. This can cause an unwanted monitor replay.

  Change this so that the lock is released immediately after the reconfiguration is complete.

  Signed-off-by: Martin Schwenke <martin@meltin.net>
  Reviewed-by: Amitay Isaacs <amitay@gmail.com>
* ctdb/eventscripts: Do not reconfigure in "monitor" events | Martin Schwenke | 2013-12-17 | 1 | -10/+0
  "monitor" events can be cancelled. If a reconfigure action does a service restart then the "monitor" event can be cancelled at the inconvenient moment after the service is stopped. In this case the service stays down and the node may become unhealthy (depending on whether there are any repair actions in the monitor event).

  A long time ago we did service reconfiguration in "monitor" events following failovers. Service reconfiguration was then moved to the "ipreallocated" event. However, reconfiguration in "monitor" events has been kept as a last resort in case an "ipreallocated" event does not occur.

  The only important case that this covers is "ctdb deleteip", where "releaseip" events are generated without a corresponding "ipreallocated". Therefore, IPs can be deleted without running the required service reconfiguration.

  The supported way of removing IP addresses is now via "ctdb reloadips", which always causes a takeover run with a corresponding "ipreallocated" event. This means that service reconfiguration in "monitor" events is no longer required and should be removed because it is unsafe.

  Also update the associated tests. Make the first confirm that the monitor event no longer does reconfiguration. Change the others to test that monitor status is correctly replayed when something else is doing a reconfigure and currently holds the reconfigure lock.

  Signed-off-by: Martin Schwenke <martin@meltin.net>
  Reviewed-by: Amitay Isaacs <amitay@gmail.com>
  Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
  Autobuild-Date(master): Tue Dec 17 06:32:35 CET 2013 on sn-devel-104
* ctdb-scripts: Be careful when generating unique pids for stack traces | Amitay Isaacs | 2013-11-27 | 1 | -1/+1
  sort expects the data to be line based, so make it so.

  Signed-off-by: Amitay Isaacs <amitay@gmail.com>
  Reviewed-by: Michael Adam <obnox@samba.org>
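The point about sort can be illustrated with a small sketch. The function name unique_pids is hypothetical, and this is not the line that was changed in the script; it only shows why a space-separated PID list has to be turned into one PID per line before sort can deduplicate it.

```shell
# Hypothetical helper: print the unique PIDs from the arguments,
# one per line, in numeric order.
unique_pids ()
{
    # printf emits one argument per line, which is the input format
    # sort works on; a single space-separated line would be treated
    # as one record and left unchanged.
    printf '%s\n' "$@" | sort -n -u
}
```

Calling `unique_pids 12 7 12 7 3` prints 3, 7 and 12 on separate lines.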
* ctdb-config: Simplify the default CTDB configuration file | Amitay Isaacs | 2013-11-27 | 1 | -322/+19
  Signed-off-by: Amitay Isaacs <amitay@gmail.com>
  Pair-programmed-with: Martin Schwenke <martin@meltin.net>
  Reviewed-by: Michael Adam <obnox@samba.org>
* ctdb-scripts: Replace hard-coded /var/ctdb with CTDB_VARDIR | Amitay Isaacs | 2013-11-27 | 1 | -2/+2
  Signed-off-by: Amitay Isaacs <amitay@gmail.com>
  Reviewed-by: Michael Adam <obnox@samba.org>
* ctdb-scripts: Set defaults for CTDB_DBDIR and CTDB_DBDIR_PERSISTENT | Amitay Isaacs | 2013-11-27 | 1 | -0/+5
  If these configuration variables are not defined, then there should be a default fallback. This is a workaround until CTDB compile-time configuration can be accessed at runtime.

  Signed-off-by: Amitay Isaacs <amitay@gmail.com>
  Reviewed-by: Michael Adam <obnox@samba.org>
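One common shell idiom for this kind of fallback is sketched below. The function name is hypothetical, and the default values (/var/ctdb and a "persistent" subdirectory) are assumptions for illustration, since the commit message does not state which defaults were chosen.

```shell
# Hypothetical sketch: apply fallbacks only when the variables are
# unset or empty. The default paths here are assumptions.
set_db_defaults ()
{
    : "${CTDB_VARDIR:=/var/ctdb}"
    : "${CTDB_DBDIR:=${CTDB_VARDIR}}"
    : "${CTDB_DBDIR_PERSISTENT:=${CTDB_DBDIR}/persistent}"
}
```

A value already set in the configuration file is left alone, so an administrator who sets only CTDB_DBDIR would get a persistent directory derived from it.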
* ctdb-eventscripts: Perform share check before NFS RPC checks in 60.ganesha | Amitay Isaacs | 2013-11-27 | 1 | -6/+6
  If NFS RPC checks do restart Ganesha, then it's possible that share check can fail prematurely.

  Signed-off-by: Amitay Isaacs <amitay@gmail.com>
  Reviewed-by: Michael Adam <obnox@samba.org>
* ctdb-scripts: Add an early exit to statd-callout's notify case | Martin Schwenke | 2013-11-27 | 1 | -0/+1
  If $statd_state is empty then the loop will run once and print spurious errors.

  Signed-off-by: Martin Schwenke <martin@meltin.net>
  Reviewed-by: Michael Adam <obnox@samba.org>
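The failure mode is a general shell pitfall: echoing an empty variable still emits one empty line, so a piped while-read loop runs once. A minimal sketch, with a hypothetical function name and a made-up "server client" record format that is not taken from statd-callout:

```shell
# Hypothetical sketch of the guard. The record format is an
# assumption for illustration only.
process_statd_state ()
{
    _statd_state="$1"

    # Early exit: without it, echo "" feeds one empty line to the
    # loop and the body runs once with empty variables.
    [ -n "$_statd_state" ] || return 0

    echo "$_statd_state" | while read _server _client ; do
        echo "notify ${_client} from ${_server}"
    done
}
```

With the guard in place, `process_statd_state ""` prints nothing instead of a line with empty fields.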
* ctdb-eventscripts: Remove the nfs_statd_update() call from 60.ganesha | Martin Schwenke | 2013-11-27 | 1 | -4/+0
  Signed-off-by: Martin Schwenke <martin@meltin.net>
  Reviewed-by: Michael Adam <obnox@samba.org>
* ctdb-scripts: Run a single instance of debug_locks.sh at a given time | Amitay Isaacs | 2013-11-27 | 1 | -24/+34
  This prevents spamming of logs if multiple lock requests are waiting and keep timing out.

  Also, improve the logging format with separators.

  Signed-off-by: Amitay Isaacs <amitay@gmail.com>
  Reviewed-by: Michael Adam <obnox@samba.org>
* ctdb-scripts: Rewrite statd-callout to avoid 10 minute lag | Martin Schwenke | 2013-11-27 | 2 | -111/+94
  This is naive and assumes no performance problems when updating persistent DBs. It also does no error handling.

  Signed-off-by: Martin Schwenke <martin@meltin.net>
  Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>
  Reviewed-by: Michael Adam <obnox@samba.org>
* ctdb-scripts: debug_locks.sh should use configuration to find TDB location | Martin Schwenke | 2013-11-27 | 1 | -2/+10
  That is, don't use fixed paths.

  Signed-off-by: Martin Schwenke <martin@meltin.net>
  Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>
  Reviewed-by: Michael Adam <obnox@samba.org>
* eventscript: Fix link creation failure if the link already exists but the target path is missing | Srikrishan Malik | 2013-11-01 | 1 | -1/+1
  Signed-off-by: Srikrishan Malik <srimalik@in.ibm.com>
  (This used to be ctdb commit 370022e1ff654db99d0c3ce0c49914c249e57289)
* eventscripts: Rewrite the smb.conf cache file handling | Martin Schwenke | 2013-10-29 | 1 | -77/+48
  The background update is never guaranteed to complete before the cache is used, so don't bother trying it at the beginning.

  Instead, put a timeout on a foreground update. If the foreground update fails:

  * If there's no available cache file then die.
  * If there is a previous cache file then use it and log a warning.
  * Do a background update at the end of the monitor event.

  Also remove commas in the "smb ports" list before use, since (newer?) testparm seems to insert commas into the default value. Update the associated test to add a comma.

  Signed-off-by: Martin Schwenke <martin@meltin.net>
  Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>
  (This used to be ctdb commit 8c6f511254ecb0381a609b37e3a0ee6e5ec5d562)
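The fallback logic listed above can be sketched as follows. This is an illustration under stated assumptions, not the 50.samba code: the function name, the 10 second limit and the ".new" temporary file are all hypothetical, the update command is passed in as arguments, and the timeout used is the coreutils timeout(1) command. The background update at the end of the monitor event is not shown.

```shell
# Hypothetical sketch: refresh a cache file in the foreground with a
# time limit, falling back to the previous cache if the update fails.
# Usage: smbconf_cache_update CACHE_FILE COMMAND [ARGS...]
smbconf_cache_update ()
{
    _cache="$1" ; shift

    if timeout 10 "$@" >"${_cache}.new" 2>/dev/null ; then
        # Update succeeded in time: replace the cache.
        mv "${_cache}.new" "$_cache"
    elif [ -r "$_cache" ] ; then
        # Update failed or timed out: keep the old cache, warn.
        rm -f "${_cache}.new"
        echo "WARNING: update failed, using previous ${_cache}" >&2
    else
        # No previous cache to fall back on: give up.
        rm -f "${_cache}.new"
        echo "ERROR: update failed and no cache file is available" >&2
        return 1
    fi
}
```

Writing to a temporary file first means a failed or partial update never clobbers a usable cache.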
* initscript: Update systemd configuration to put PID file in /run/ctdb | Martin Schwenke | 2013-10-25 | 1 | -3/+3
  Elsewhere we're moving the socket to /var/run/ctdb. We might end up with PID files and sockets for other daemons later, so let's call the directory "ctdb" instead of "ctdbd".

  Signed-off-by: Martin Schwenke <martin@meltin.net>
  (This used to be ctdb commit b63f6fd2d295c8e18cbf3420ab05fce07b727f31)
* build: Move the default CTDB socket from /tmp to /var/run/ctdb | Amitay Isaacs | 2013-10-25 | 1 | -2/+2
  Use /var/run/ctdb/ctdbd.socket because there might be other daemons that need sockets in the future.

  The local daemons test code to create a link for the default convenience socket has to be removed because the link can't be created as a regular user in the new location. This should be OK since all calls to the ctdb tool in the test code should be wrapped in onnode. When debugging tests, a developer will have to set CTDB_SOCKET by hand.

  Signed-off-by: Amitay Isaacs <amitay@gmail.com>
  Pair-programmed-with: Martin Schwenke <martin@meltin.net>
  (This used to be ctdb commit dc67a4e24af9d07aead2a1710eeaf5d6cc409201)
* Add missing $remote_fs LSB dependency | Mathieu Parent | 2013-10-24 | 1 | -2/+2
  (This used to be ctdb commit a0b965bb73777dde7a4abf80c5c4742581bce520)
* eventscripts: Instead of listing all tunables, query EventScriptTimeout | Amitay Isaacs | 2013-10-24 | 1 | -1/+1
  Signed-off-by: Amitay Isaacs <amitay@gmail.com>
  (This used to be ctdb commit 58ca2c3e7e3a27023ad86660f01a2052e2a19635)
* initscript: New configuration variable CTDB_DBDIR_STATE | Martin Schwenke | 2013-10-22 | 1 | -0/+1
  Signed-off-by: Martin Schwenke <martin@meltin.net>
  (This used to be ctdb commit 30d9b634b16c3cc740e5e453ea5c21012b1fde88)
* scripts: Make detect_init_style() more readable | Martin Schwenke | 2013-10-22 | 1 | -2/+3
  Signed-off-by: Martin Schwenke <martin@meltin.net>
  (This used to be ctdb commit 516cdea0e73cf3f63b3303e22809834c8cbc64e4)
* eventscripts: Rework the iSCSI eventscript | Martin Schwenke | 2013-10-22 | 1 | -11/+13
  * It should run on "ipreallocated" instead of "recovered"
  * Variable name NODE -> ip since that's what it is
  * Simplify some logic

  Signed-off-by: Martin Schwenke <martin@meltin.net>
  (This used to be ctdb commit 45e2bc66abf9fcfeadcc279a656ed7fd1838920a)
* eventscripts: Don't update static routes on "recovered" event | Martin Schwenke | 2013-10-22 | 1 | -1/+1
  Routes only need to be updated when IPs have moved. IP takeover runs will generate "ipreallocated", which is enough. "recovered" always follows "ipreallocated" anyway, so avoid the redundancy.

  Signed-off-by: Martin Schwenke <martin@meltin.net>
  (This used to be ctdb commit 1152215fc69217e4292762e28d193b7ea0e06ee3)
* eventscripts: NAT gateway script doesn't need to handle "recovered" event | Martin Schwenke | 2013-10-22 | 1 | -9/+3
  Any time a node changes flags in any significant way there will be a takeover run, which will generate an "ipreallocated" event. The "recovered" event always happens straight after a takeover run so we update the NAT gateway twice.

  Signed-off-by: Martin Schwenke <martin@meltin.net>
  (This used to be ctdb commit 542c70d6281d636ecd51502fbbf219f418bfac66)
* eventscripts: Delete placeholder "recovered" and "shutdown" events | Martin Schwenke | 2013-10-22 | 1 | -11/+0
  Signed-off-by: Martin Schwenke <martin@meltin.net>
  (This used to be ctdb commit 00736a21fc268c10b6a718731e56b3dbb7e60554)
* eventscripts: Clean up comment at the top of 00.ctdb | Martin Schwenke | 2013-10-22 | 1 | -9/+3
  Signed-off-by: Martin Schwenke <martin@meltin.net>
  (This used to be ctdb commit 2ea9d3acfe7e8665685f54294f5edc9b8ffc2f3f)