path: root/ctdb/config/functions
Commit message | Author | Age | Files | Lines
* ctdb-scripts: Remove unused function nfs_statd_update() | Martin Schwenke | 2015-03-04 | 1 | -17/+0
  Signed-off-by: Martin Schwenke <martin@meltin.net>
  Reviewed-by: Amitay Isaacs <amitay@gmail.com>
* ctdb-scripts: Change statd-callout to be more scalable | Martin Schwenke | 2015-03-04 | 1 | -0/+10
  Updating ctdb.tdb on each add-client, del-client and each delete during notify was too ambitious. Persistent transactions do not perform well enough to do this.
  Revert to having add-client and del-client create touch files. Each monitor event calls "statd-callout update" to convert touch files into ctdb.tdb records.
  Update testcases to do the "update" and add an extra test.
  Signed-off-by: Martin Schwenke <martin@meltin.net>
  Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>
  Reviewed-by: Amitay Isaacs <amitay@gmail.com>
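The touch-file pattern described in the commit above can be sketched roughly as follows. This is not the statd-callout code: the directory, the function names and the "add@"/"del@" file naming are all hypothetical, and the batch step only prints the pending changes where a real implementation would write them to ctdb.tdb in one pass.

```shell
# Hypothetical sketch of "cheap touch files now, one batched update later".
# STATD_TOUCH_DIR and all function names are assumptions for illustration.
STATD_TOUCH_DIR="${STATD_TOUCH_DIR:-/tmp/statd-touch}"

# Cheap per-client operations: just create an empty marker file
add_client ()
{
    mkdir -p "$STATD_TOUCH_DIR" && : >"$STATD_TOUCH_DIR/add@$1"
}

del_client ()
{
    mkdir -p "$STATD_TOUCH_DIR" && : >"$STATD_TOUCH_DIR/del@$1"
}

# Run once per monitor event: emit one "ACTION CLIENT" line per marker
# and remove the marker.  A real version would turn these lines into
# database records instead of printing them.
update_clients ()
{
    for _f in "$STATD_TOUCH_DIR"/*@* ; do
        [ -e "$_f" ] || continue
        _b="${_f##*/}"
        echo "${_b%%@*} ${_b#*@}"
        rm -f "$_f"
    done
}
```

The point of the design is that the expensive operation happens at most once per monitor interval, regardless of how many clients were added or removed in between.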
* ctdb-scripts: Call iptables/ip6tables directly from iptables_wrapper | Martin Schwenke | 2015-01-28 | 1 | -11/+5
  Drops the iptables() and ip6tables() functions and, hence, the hardcoding of paths /sbin/iptables and /sbin/ip6tables. The latter avoids problems on openSUSE where (for example) /usr/sbin/iptables is used instead.
  This means that locking around ip*tables commands is only done when iptables_wrapper is called directly. This is fine because the only conflict is when "releaseip" or "takeip"/"updateip" events are run in parallel. The other uses in 11.natgw and 70.iscsi are in events where there will be no collisions.
  Making 11.natgw support IPv6 is unnecessary. Just put a static IPv6 address on each interface - they're plentiful.
  Signed-off-by: Martin Schwenke <martin@meltin.net>
  Reviewed-by: Amitay Isaacs <amitay@gmail.com>
  Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
  Autobuild-Date(master): Wed Jan 28 08:29:55 CET 2015 on sn-devel-104
* ctdb-scripts: Don't use the GNU awk gensub() function | Martin Schwenke | 2015-01-09 | 1 | -3/+4
  This is a gawk extension and can't be used reliably if just running "awk". It is simple enough to switch to using the standard sub() and gsub() functions.
  The alternative is to switch to explicitly running "gawk". However, although the eventscripts aren't exactly portable, it is probably better to move closer to portability than further away.
  Signed-off-by: Martin Schwenke <martin@meltin.net>
  Reviewed-by: Michael Adam <obnox@samba.org>
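As an illustration of the kind of rewrite this commit describes (not the actual lines changed in the functions file), consider stripping a trailing "/NN" prefix length from an address. The helper name strip_maskbits is hypothetical. gensub() returns the modified string, whereas the standard sub() edits its target in place and returns a count, so the portable form copies the field first.

```shell
# gawk-only form, shown for comparison; fails under awks without gensub():
#   awk '{ print gensub(/\/[0-9]+$/, "", 1, $1) }'

# Portable form using the standard sub(): copy the field, edit the copy
# in place, then print it.  strip_maskbits is a made-up name.
strip_maskbits ()
{
    awk '{ ip = $1; sub(/\/[0-9]+$/, "", ip); print ip }'
}
```

For example, `echo "10.0.0.1/24 eth0" | strip_maskbits` prints the address without its prefix length.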
* ctdb-scripts: Try to deal with Ubuntu having /usr/sbin/service | Martin Schwenke | 2015-01-09 | 1 | -0/+2
  Falling back to running the initscript doesn't work because it detects that upstart is being used and fails. This was observed when trying to start winbind on Ubuntu 11.04.
  Signed-off-by: Martin Schwenke <martin@meltin.net>
  Reviewed-by: Michael Adam <obnox@samba.org>
* ctdb-scripts: Wait until IPv6 addresses are not "tentative" | Martin Schwenke | 2014-12-05 | 1 | -0/+23
  There are a few potential failure modes when adding an IPv6 address. It takes a little while of duplicate address detection to complete, so wait for a while. After a timeout, also need to check to see if duplicate address detection failed - if it did then actually drop the IP address.
  This really needs some careful thinking. If CTDB disappears on a node but the node's IP addresses are still on interfaces then the above failure mode could cause the takeover nodes to become banned.
  Signed-off-by: Martin Schwenke <martin@meltin.net>
  Reviewed-by: Amitay Isaacs <amitay@gmail.com>
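The wait-then-check logic described above might look something like the sketch below. It is an assumption-laden illustration rather than the code added by the commit: the function name, the argument order and the default of ten one-second tries are invented here. It relies on the "tentative" and "dadfailed" address flags printed by `ip -6 addr show`.

```shell
# Illustrative sketch; function name, argument order and the default of
# 10 one-second tries are assumptions, not the code from this commit.
wait_until_not_tentative ()
{
    _iface="$1"
    _ip="$2"
    _maskbits="$3"
    _tries="${4:-10}"

    while [ "$_tries" -gt 0 ] ; do
        # A "tentative" flag means duplicate address detection (DAD)
        # has not successfully finished yet
        if ! ip -6 addr show dev "$_iface" to "$_ip" | grep -q "tentative" ; then
            return 0
        fi
        _tries=$((_tries - 1))
        sleep 1
    done

    # Timed out: if DAD failed, drop the address and report failure
    if ip -6 addr show dev "$_iface" to "$_ip" | grep -q "dadfailed" ; then
        ip -6 addr del "${_ip}/${_maskbits}" dev "$_iface"
        return 1
    fi

    # Still tentative but not failed: leave the address in place
    return 0
}
```

Whether a still-tentative address after the timeout should count as success is a policy choice; the sketch treats it as success, which may not match the real script.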
* ctdb-eventscripts: Specify broadcast optionally to ip addr add | Amitay Isaacs | 2014-12-05 | 1 | -1/+7
  Signed-off-by: Amitay Isaacs <amitay@gmail.com>
  Reviewed-by: Martin Schwenke <martin@meltin.net>
* ctdb-scripts: Make 10.interface IPv6-safe | Martin Schwenke | 2014-12-05 | 1 | -0/+6
  Add checking to "releaseip" and "updateip" to ensure that the given IP address is really on the given interface with the given netmask. If reality doesn't match the given arguments then believe reality.
  Use new function iptables_wrapper() instead of calling iptables() directly.
  Use new function flush_route_cache() instead of doing IPv4-specific /proc magic.
  Remove setting of otherwise unused variable "failed".
  Fix a test for which the error message has changed.
  Signed-off-by: Martin Schwenke <martin@meltin.net>
  Reviewed-by: Amitay Isaacs <amitay@gmail.com>
* ctdb-scripts: New functions ip6tables() and iptables_wrapper() | Martin Schwenke | 2014-12-05 | 1 | -1/+14
  ip6tables() uses the same lock as iptables(). This is done on suspicion.
  iptables_wrapper() takes 1st argument "inet" or "inet6", and the rest is passed to the correct iptables variant.
  Signed-off-by: Martin Schwenke <martin@meltin.net>
  Reviewed-by: Amitay Isaacs <amitay@gmail.com>
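A wrapper with the calling convention described above could be sketched as follows. This is a hedged illustration, not the function from the commit: the lock file location, the 30-second lock timeout and the error message are assumptions. It uses the flock(1) utility so that IPv4 and IPv6 rule changes share one lock.

```shell
# Sketch only: the lock path, timeout and error text are assumptions.
iptables_wrapper ()
{
    _family="$1" ; shift

    case "$_family" in
    inet)  _cmd="iptables"  ;;
    inet6) _cmd="ip6tables" ;;
    *)
        echo "iptables_wrapper: unknown address family \"$_family\"" >&2
        return 1
        ;;
    esac

    # One lock file for both variants, so parallel events are serialised.
    # The command is found via PATH rather than a hardcoded /sbin path.
    flock -w 30 "${CTDB_VARDIR:-/tmp}/iptables.flock" "$_cmd" "$@"
}
```

A caller would pass the family first and ordinary rule arguments after it, for example `iptables_wrapper inet6 -L -n`.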
* ctdb-scripts: Add IPv6 addresses support in ip_maskbits_iface() | Martin Schwenke | 2014-12-05 | 1 | -2/+9
  It also prints a third word, the address family. This is either "inet" or "inet6".
  Signed-off-by: Martin Schwenke <martin@meltin.net>
  Reviewed-by: Amitay Isaacs <amitay@gmail.com>
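The sketch below shows only one simple way the family word could be derived from an address string. It is not how ip_maskbits_iface() is implemented, and the helper name ip_family is hypothetical; it simply relies on the fact that textual IPv6 addresses contain a colon and dotted-quad IPv4 addresses do not.

```shell
# Hypothetical helper, not part of the functions file: print the address
# family word for a textual IP address.
ip_family ()
{
    case "$1" in
    *:*) echo "inet6" ;;
    *)   echo "inet"  ;;
    esac
}
```

The resulting word is suitable as the first argument of a family-dispatching wrapper such as the iptables_wrapper() described in the previous commit.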
* ctdb-scripts: Update eventscripts to use ctdb -X instead of ctdb -Y | Martin Schwenke | 2014-12-05 | 1 | -10/+10
  Also update associated eventscript unit tests and ctdb stub.
  Signed-off-by: Martin Schwenke <martin@meltin.net>
  Reviewed-by: Amitay Isaacs <amitay@gmail.com>
* ctdb-scripts: Add rpc.statd stack dumping to Ganesha restart | Martin Schwenke | 2014-11-18 | 1 | -1/+3
  Signed-off-by: Martin Schwenke <martin@meltin.net>
  Reviewed-by: Amitay Isaacs <amitay@gmail.com>
* ctdb-scripts: Dump stack traces for hung mountd, rquotad, statd processes | Martin Schwenke | 2014-11-18 | 1 | -0/+3
  Add a corresponding new unit test for statd.
  Signed-off-by: Martin Schwenke <martin@meltin.net>
  Reviewed-by: Amitay Isaacs <amitay@gmail.com>
* ctdb-scripts: Add optional program name argument to nfs_dump_some_threads() | Martin Schwenke | 2014-11-18 | 1 | -1/+3
  Signed-off-by: Martin Schwenke <martin@meltin.net>
  Reviewed-by: Amitay Isaacs <amitay@gmail.com>
* ctdb-scripts: Factor out new function program_stack_traces() | Martin Schwenke | 2014-11-18 | 1 | -16/+24
  In the process, fix a bug where an extra trace would be printed.
  Signed-off-by: Martin Schwenke <martin@meltin.net>
  Reviewed-by: Amitay Isaacs <amitay@gmail.com>
* ctdb-logging: Add logging via UDP to 127.0.0.1:514 to syslog backend | Martin Schwenke | 2014-10-28 | 1 | -0/+2
  This has most of the advantages of the old logd with none of the complexity of the extra process. There are several good syslog implementations that can listen on the UDP port.
  Signed-off-by: Martin Schwenke <martin@meltin.net>
  Reviewed-by: Amitay Isaacs <amitay@gmail.com>
* ctdb-logging: New option CTDB_LOGGING, remove CTDB_LOGFILE, CTDB_SYSLOG | Martin Schwenke | 2014-10-28 | 1 | -9/+17
  Remove --logfile and --syslog daemon options and replace with --logging. Modularise and clean up logging initialisation code. The initialisation API includes an app_name argument that is currently unused - this will be used in extensions to the syslog backend.
  Signed-off-by: Martin Schwenke <martin@meltin.net>
  Reviewed-by: Amitay Isaacs <amitay@gmail.com>
* ctdb-scripts: Support NFS on RHEL7 with systemd | Martin Schwenke | 2014-07-07 | 1 | -2/+4
  Need to be able to recognise a RHEL system. Still use "system" to start and stop service, since that still works and yields the smallest change.
  Signed-off-by: Martin Schwenke <martin@meltin.net>
  Reviewed-by: Amitay Isaacs <amitay@gmail.com>
* ctdb-eventscripts: Fix regression in IP add/delete functions | Martin Schwenke | 2014-03-23 | 1 | -4/+8
  Commit 176ae6c704528c021fcc34a41878584f43a00119 caused these functions to exit on failure. This is incorrect and broke NAT gateway.
  Signed-off-by: Martin Schwenke <martin@meltin.net>
  Reviewed-by: Amitay Isaacs <amitay@gmail.com>
* ctdb-eventscripts: Switch on dumping of stuck nfsd threads | Martin Schwenke | 2014-02-25 | 1 | -1/+1
  This feature was added quite a while ago but was not enabled by default. It is a useful feature so enable it to dump stack traces of up to 5 stuck processes by default. This can be disabled by setting:
    CTDB_NFS_DUMP_STUCK_THREADS=0
  Signed-off-by: Martin Schwenke <martin@meltin.net>
  Reviewed-by: Amitay Isaacs <amitay@gmail.com>
  Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
  Autobuild-Date(master): Tue Feb 25 04:06:45 CET 2014 on sn-devel-104
* ctdb-scripts: Update a misleading comment | Martin Schwenke | 2014-02-19 | 1 | -8/+1
  This comment was true when 50.samba was spaghetti because it tried to automatically manage both smbd (and nmbd) and winbind. It isn't true anymore.
  Signed-off-by: Martin Schwenke <martin@meltin.net>
  Reviewed-by: Amitay Isaacs <amitay@gmail.com>
  Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
  Autobuild-Date(master): Wed Feb 19 04:07:12 CET 2014 on sn-devel-104
* ctdb-eventscripts: Deleting IPs should use the promote_secondaries option | Martin Schwenke | 2014-02-13 | 1 | -71/+16
  If a primary IP address is being deleted from an interface, the secondaries are remembered and added back after the primary is deleted. This is done under a lock shared by the add/del script code. It is necessary because, by default, Linux deletes secondaries when the corresponding primary is deleted.
  There is a race here between ctdbd and the scripts, since ctdbd doesn't know about the lock. If ctdbd receives a release IP control and the IP address is not on an interface then it is regarded as a "Redundant release of IP" so no "releaseip" event is generated. This can occur if the IP address in question is a secondary that has been temporarily dropped. It is more likely if the number of secondaries is large.
  Since Linux 2.6.12 (i.e. 2005) Linux has supported a promote_secondaries option on interfaces. This option is currently undocumented but that will change in Linux 3.14. With promote_secondaries enabled the kernel will not drop secondaries but will promote a corresponding secondary instead. The kernel does all necessary locking.
  Use promote_secondaries to simplify the code, avoid re-adding secondaries, avoid re-adding routes and provide improved performance.
  This could be done conditionally, with a fallback to legacy secondary-re-adding code, but no supported Linux distribution is running a pre-2.6.12 kernel so this is unnecessary.
  Signed-off-by: Martin Schwenke <martin@meltin.net>
  Reviewed-by: Amitay Isaacs <amitay@gmail.com>
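On Linux the option is exposed per interface under /proc/sys/net/ipv4/conf/. A minimal sketch of enabling it before deleting an address follows; the function name and the PROC_ROOT override are assumptions made here (the override exists only so the example can be tried without root), and this is not the code from the commit.

```shell
# Sketch: enable promote_secondaries on one interface.  PROC_ROOT is a
# made-up override for experimentation; it defaults to the real /proc.
enable_promote_secondaries ()
{
    _iface="$1"
    _f="${PROC_ROOT:-/proc}/sys/net/ipv4/conf/${_iface}/promote_secondaries"

    # Fail quietly if the interface (or the option) does not exist
    [ -f "$_f" ] || return 1

    echo 1 >"$_f"
}
```

With the option set, deleting a primary address causes the kernel to promote one of its secondaries instead of dropping them all, so the script no longer has to remember and re-add them.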
* ctdb/eventscripts: Move all eventscript state under $CTDB_VARDIR/state | Martin Schwenke | 2014-01-17 | 1 | -4/+4
  Services can be flagged for reconfigure when they release IPs at shutdown. The flag is never removed and the service is prematurely reconfigured during the first "ipreallocated" event, before any IPs are hosted and before the "startup" event has actually started the services.
  $CTDB_VARDIR/state directly contained the service state subdirectories and is already removed in the "init" event. Just push the service state subdirectories down a level and put everything else in a subdirectory. This way all the eventscript state gets cleaned up every time CTDB starts up.
  Signed-off-by: Martin Schwenke <martin@meltin.net>
  Reviewed-by: Amitay Isaacs <amitay@gmail.com>
  Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
  Autobuild-Date(master): Fri Jan 17 09:58:26 CET 2014 on sn-devel-104
* ctdb/eventscripts: Print a count if killing TCP connections times out | Martin Schwenke | 2014-01-17 | 1 | -1/+1
  Also update related test
  Signed-off-by: Martin Schwenke <martin@meltin.net>
  Reviewed-by: Amitay Isaacs <amitay@gmail.com>
* ctdb/eventscripts: Reconfigure lock should be released quickly | Martin Schwenke | 2014-01-17 | 1 | -2/+12
  Currently the lock is held until the corresponding eventscript completes, since the process still exists. If the regular part of an eventscript hangs then the lock might unnecessarily be held for a long time.
  The pathological case is when a monitor event gets stuck in D-wait state and the script times out but can't be killed so the lock is still held. This can cause an unwanted monitor replay.
  Change this so that the lock is released immediately after the reconfiguration is complete.
  Signed-off-by: Martin Schwenke <martin@meltin.net>
  Reviewed-by: Amitay Isaacs <amitay@gmail.com>
* ctdb/eventscripts: Do not reconfigure in "monitor" events | Martin Schwenke | 2013-12-17 | 1 | -10/+0
  "monitor" events can be cancelled. If a reconfigure action does a service restart then the "monitor" event can be cancelled at the inconvenient moment after the service is stopped. In this case the service stays down and the node may become unhealthy (depending on whether there are any repair actions in the monitor event).
  A long time ago we did service reconfiguration in "monitor" events following failovers. Service reconfiguration was then moved to the "ipreallocated" event. However, reconfiguration in "monitor" events has been kept as a last resort in case an "ipreallocate" event does not occur. The only important case that this covers is "ctdb deleteip", where "releaseip" events are generated without a corresponding "ipreallocated". Therefore, IPs can be deleted without running the required service reconfiguration.
  The supported way of removing IP addresses is now via "ctdb reloadips", which always causes a takeover run with a corresponding "ipreallocate" event. This means that service reconfiguration in "monitor" events is no longer required and should be removed because it is unsafe.
  Also update the associated tests. Make the first confirm that the monitor event no longer does reconfiguration. Change the others to test that monitor status is correctly replayed when something else is doing a reconfigure and currently holds the reconfigure lock.
  Signed-off-by: Martin Schwenke <martin@meltin.net>
  Reviewed-by: Amitay Isaacs <amitay@gmail.com>
  Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
  Autobuild-Date(master): Tue Dec 17 06:32:35 CET 2013 on sn-devel-104
* scripts: Make detect_init_style() more readable | Martin Schwenke | 2013-10-22 | 1 | -2/+3
  Signed-off-by: Martin Schwenke <martin@meltin.net>
  (This used to be ctdb commit 516cdea0e73cf3f63b3303e22809834c8cbc64e4)
* eventscripts: Fold ctdb_check_tcp_ports_ctdb() into ctdb_check_tcp_ports() | Martin Schwenke | 2013-10-22 | 1 | -50/+16
  A generic framework is no longer needed now that the "ctdb" checker is the only one left. Simplify the code.
  Signed-off-by: Martin Schwenke <martin@meltin.net>
  (This used to be ctdb commit 044d302b41a2040642355401e3236fcecc3a620a)
* eventscripts: Remove TCP port checks other than the built-in CTDB one | Martin Schwenke | 2013-10-22 | 1 | -69/+1
  "ctdb checktcpport" is no longer experimental so the other checkers are no longer required.
  Remove tests related to the removed checkers.
  Signed-off-by: Martin Schwenke <martin@meltin.net>
  (This used to be ctdb commit 50e330d0679614bee2e7bab028436e929f74ca50)
* scripts: Remove setting of PATH from functions file | Martin Schwenke | 2013-10-22 | 1 | -2/+0
  The current setting is inconsistent with settings on most systems, putting /bin before /sbin. Use of /usr/local/bin, which may be required on some systems, is also overridden. This can make it difficult to do interactive debugging of script problems.
  Rely on the system PATH instead. If system-specific changes need to be made then this can be done in a configuration file.
  Signed-off-by: Martin Schwenke <martin@meltin.net>
  (This used to be ctdb commit cfbff39e22e42f3997f637290748290833525714)
* scripts: Simplify script_log() to just look at CTDB_SYSLOG variable | Martin Schwenke | 2013-10-22 | 1 | -6/+1
  The old logic was actually wrong. If CTDB_LOGFILE is unset then a default is used, not syslog.
  Signed-off-by: Martin Schwenke <martin@meltin.net>
  (This used to be ctdb commit 79e2029f9bc078126e865aa715100a3870c7604b)
* scripts: Remove support for CTDB_OPTIONS configuration variable | Martin Schwenke | 2013-10-22 | 1 | -3/+0
  Allowing people to put random options in CTDB_OPTIONS complicates some logic (particularly around use of syslog). If we're going to have variables for options then let's make sure we have a variable for each option and make people use them.
  Signed-off-by: Martin Schwenke <martin@meltin.net>
  (This used to be ctdb commit e55f3a1577eff0182802b0341d865d961aeae1c7)
* scripts: Remove unused configuration variable CTDB_MANAGES_SCP | Martin Schwenke | 2013-10-22 | 1 | -1/+0
  Signed-off-by: Martin Schwenke <martin@meltin.net>
  (This used to be ctdb commit bda0da41aaf629a252cc361b73ebc5328f26ed04)
* eventscripts: Fix comment - CTDB_TCP_PORT_CHECKS -> CTDB_TCP_PORT_CHECKERS | Martin Schwenke | 2013-10-22 | 1 | -1/+1
  Signed-off-by: Martin Schwenke <martin@meltin.net>
  (This used to be ctdb commit 0a79ba2f1277a776347e2c3f04ce8419e0be62de)
* scripts: Add support for optional ctdbd.conf configuration file | Martin Schwenke | 2013-09-25 | 1 | -0/+7
  Signed-off-by: Martin Schwenke <martin@meltin.net>
  Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>
  (This used to be ctdb commit 8f660d0dd52013e5876806be908e8e603aa6e968)
* eventscripts: Improve message logged when a counter hits a limit | Martin Schwenke | 2013-08-14 | 1 | -1/+1
  It should print the actual number of consecutive failures rather than the limit.
  Signed-off-by: Martin Schwenke <martin@meltin.net>
  (This used to be ctdb commit ff5f0d1e29af2b293e30cdc54bed03a644be7038)
* eventscripts: Print a message when waiting for TCP connections to be killed | Martin Schwenke | 2013-08-14 | 1 | -1/+4
  This makes the gaps in the logs more obvious.
  Signed-off-by: Martin Schwenke <martin@meltin.net>
  (This used to be ctdb commit 11fbf4789d783dd0bac22754b374dd9ea4b03bad)
* eventscripts: New configuration variable $CTDB_RPCINFO_LOCALHOST | Martin Schwenke | 2013-08-14 | 1 | -1/+3
  Passing "localhost" to the rpcinfo command causes overheads, like reading /etc/services multiple times.
  Signed-off-by: Martin Schwenke <martin@meltin.net>
  Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>
  (This used to be ctdb commit 1d61988af9e4fa3621a3e2d06a859bcb53df2d67)
* eventscripts: Add modulo (%) operator to ctdb_check_counter() | Martin Schwenke | 2013-08-14 | 1 | -7/+12
  Also add it to the corresponding eventscript unit test infrastructure.
  Signed-off-by: Martin Schwenke <martin@meltin.net>
  (This used to be ctdb commit f4ef83a256f59eeb00b9a5bc10c28347e1ad1031)
* eventscripts: Separate out RPC service restart code | Martin Schwenke | 2013-08-14 | 1 | -41/+56
  While doing this:
    * Explicitly assign RPC program and version information in _nfs_check_rpc_common(). This is more lines of code but is easier to read.
    * Don't print the options when starting a service. Trying to print it makes the code messy for little benefit.
  Update the eventscript unit testing code and a Ganesha test to reflect this.
  Signed-off-by: Martin Schwenke <martin@meltin.net>
  (This used to be ctdb commit e8b531405665885196c95fe1608db33a255bf761)
* eventscripts: When restarting the nfslock service only show output of start | Martin Schwenke | 2013-08-14 | 1 | -2/+2
  That is, /dev/null the "stop" output. This is consistent with the way CTDB generally deals with the output when stopping a service.
  It also makes updating the eventscript unit tests easier.
  Signed-off-by: Martin Schwenke <martin@meltin.net>
  (This used to be ctdb commit c7332526b1b488abefeb4be78a7cd3f2f9abc451)
* eventscripts: kill_tcp_connections() should send connections to stdin | Martin Schwenke | 2013-07-29 | 1 | -11/+14
  This avoids issuing multiple "ctdb killtcp" commands to terminate tcp connections, one per connection. This will considerably reduce the time when there is a large number of tcp connections. This also makes it possible to avoid calling "ctdb killtcp" when there are no connections.
  Add a couple of unit tests for killtcp and update eventscript unit test infrastructure to support.
  Signed-off-by: Martin Schwenke <martin@meltin.net>
  Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>
  (This used to be ctdb commit a20d94717d2e4ab866d8a002cdf39c0669b74c6a)
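The batching idea above can be sketched as follows. This is an illustration under assumptions rather than the body of kill_tcp_connections(): the function name is invented, the "SRC:PORT DST:PORT" line format is assumed, and it presumes that "ctdb killtcp" without arguments reads the connection list from standard input, as the commit subject implies.

```shell
# Sketch: collect every connection first, then run the command once,
# and skip it entirely when there is nothing to kill.
# kill_connections_batched is a hypothetical name.
kill_connections_batched ()
{
    # One connection per line on stdin, e.g. "10.0.0.1:445 10.0.0.2:1234"
    _conns=$(cat)

    # No connections: avoid calling ctdb at all
    [ -n "$_conns" ] || return 0

    echo "$_conns" | ctdb killtcp
}
```

Compared with a loop that invokes the command once per connection, this costs a single process start no matter how many connections there are.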
* eventscripts: When replaying monitor status, don't log empty output | Martin Schwenke | 2013-07-05 | 1 | -1/+3
  Signed-off-by: Martin Schwenke <martin@meltin.net>
  (This used to be ctdb commit ce04f1c107b4392ca955d9f29b93aaaae62439ce)
* scripts: drop_ip() should use delete_ip_from_iface() | Martin Schwenke | 2013-06-20 | 1 | -1/+1
  Otherwise secondary addresses that aren't owned by CTDB could be dropped.
  Signed-off-by: Martin Schwenke <martin@meltin.net>
  (This used to be ctdb commit 5ffce65a1ad659b198ddf647622b899bdde45c72)
* scripts: drop_all_public_ips() now prints messages to stdout, not log | Martin Schwenke | 2013-06-20 | 1 | -8/+2
  Change all callers to maintain current behaviour.
  Signed-off-by: Martin Schwenke <martin@meltin.net>
  (This used to be ctdb commit 0b67397ef5419c781a35916575151da7b7e7cc27)
* eventscripts: New configuration varable $CTDB_NFS_DUMP_STUCK_THREADS | Martin Schwenke | 2013-06-14 | 1 | -0/+24
  If some nfsd threads are still alive after a shutdown during a restart then this indicates the maximum number of threads for which a stack trace should be dumped. This can be useful for trying to determine why nfsd is stuck.
  Signed-off-by: Martin Schwenke <martin@meltin.net>
  (This used to be ctdb commit 2503245db10d567af708a04edd3a3b488c24f401)
* eventscripts: Fix statd-callout update handling | Martin Schwenke | 2013-05-28 | 1 | -0/+17
  60.nfs and 60.ganesha touch $statd_update_trigger every time they're run. This stops the statd-callout updates from ever being called.
  Make this logic self-contained and move it to new function nfs_statd_update() in the functions file. Call this in 60.nfs and 60.ganesha with the appropriate update period as the only argument.
  Signed-off-by: Martin Schwenke <martin@meltin.net>
  Reported-by: Poornima Gupte <poornima.gupte@in.ibm.com>
  (This used to be ctdb commit 1b5968f6be084590667f4f15ff3bef13ed9a2973)
* scripts: Provide mktemp function for platforms without mktemp command | Martin Schwenke | 2013-05-27 | 1 | -0/+26
  This is needed for AIX and possibly others.
  Also provide a cheaper mktemp function is needed in the run_tests script.
  Signed-off-by: Martin Schwenke <martin@meltin.net>
  (This used to be ctdb commit b2b572e9049c7138bd223226475bef8fe3e01f10)
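A fallback of this kind could be written along the following lines. This is a sketch under assumptions, not the function added by the commit: the name fallback_mktemp, the file naming scheme and the limit of 100 attempts are invented here. It supports only the plain form and "-d", and relies on the shell's noclobber option and on mkdir failing for an existing path so that creation is atomic.

```shell
# Hypothetical stand-in for mktemp(1): prints the path of a newly
# created private file (or directory with -d) under $TMPDIR or /tmp.
fallback_mktemp ()
{
    _dir=false
    if [ "$1" = "-d" ] ; then
        _dir=true
    fi

    _n=0
    while [ "$_n" -lt 100 ] ; do
        _t="${TMPDIR:-/tmp}/tmp.$$.$_n"
        # Subshell keeps umask and noclobber from leaking to the caller.
        # Both mkdir and a noclobber redirect fail if the path exists.
        if (
            umask 077
            if $_dir ; then
                mkdir "$_t"
            else
                set -C
                : >"$_t"
            fi
        ) 2>/dev/null ; then
            echo "$_t"
            return 0
        fi
        _n=$((_n + 1))
    done

    return 1
}
```

A functions file might define such a replacement under the name mktemp only when `command -v mktemp` finds nothing, so platforms that have the real command keep using it.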
* eventscripts: Fix regression in _loadconfig() | Martin Schwenke | 2013-05-22 | 1 | -1/+8
  fff88940f71058e4eefd65f50a6701389c005c17 introduced a regression. Without $service_name set by default, the CTDB configuration is no longer loaded when loadconfig() is called without any arguments. That's bad.
  Signed-off-by: Martin Schwenke <martin@meltin.net>
  (This used to be ctdb commit f1619a36c1beba11533052dc5728fa3adaa08870)
* eventscripts: NFS RPC checks no longer support "knfsd" | Martin Schwenke | 2013-05-07 | 1 | -1/+1
  No longer used, support removed from test infrastructure.
  Signed-off-by: Martin Schwenke <martin@meltin.net>
  (This used to be ctdb commit 0eb351ff4c7ee096de7c5e0a59561067091fa32e)