path: root/ctdb/config/events.d
Commit message | Author | Age | Files | Lines
* event scripts: add logging for low memory conditions | Rusty Russell | 2010-02-09 | 1 | -0/+10
  We should never enter swap; if we do, show the memory state of the machine and the process list. This will help us diagnose what caused the condition before it's too late and the box starts OOM-killing processes.
  Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
  (This used to be ctdb commit 627a6d67a0e9e61f8713e62695b3518c51909230)
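The check described in this commit can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: the function names `swap_used_kb` and `check_low_memory` are hypothetical, and reading swap usage from a `/proc/meminfo`-style file is an assumption about how such a check might be done.

```shell
#!/bin/sh
# Hypothetical helper: print the amount of swap in use (kB), read from a
# meminfo-style file (defaults to /proc/meminfo).
swap_used_kb () {
    awk '/^SwapTotal:/ { swap_total = $2 }
         /^SwapFree:/  { swap_free = $2 }
         END { print swap_total - swap_free }' "${1:-/proc/meminfo}"
}

# Hypothetical check: if any swap is in use, log the memory state and the
# process list so the cause can be diagnosed later.
check_low_memory () {
    used=$(swap_used_kb "$1")
    if [ "$used" -gt 0 ]; then
        echo "WARNING: ${used} kB of swap in use"
        cat "${1:-/proc/meminfo}"
        ps auxfww
    fi
}
```

Calling `check_low_memory` with no argument from a monitor event would print nothing on a machine that has not touched swap.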
* config: 10.interface: search "ethtool" in $PATH instead of using a hardcoded path | Stefan Metzmacher | 2010-01-20 | 1 | -2/+2
  This is very useful for testing, I use such a script:
    cat ~/bin/ethtool
    #!/bin/sh
    IFACE=$1
    case "$IFACE" in
    Neth2) ;;
    Neth3) ;;
    Neth4) ;;
    Neth5) ;;
    *) exec /usr/sbin/ethtool $@ ;;
    esac
    ip link set down $IFACE
    exec /usr/sbin/ethtool $@
  metze
  (This used to be ctdb commit 3bab985cf615720eded4d47b4f9f37a9c28840aa)
* events: add updateip event to 13.per_ip_routing | Stefan Metzmacher | 2010-01-20 | 1 | -0/+60
  metze
  (This used to be ctdb commit 829150e814a5e6c85d0f21421f46f41e81d74c53)
* events: 10.interface handle updateip event | Stefan Metzmacher | 2010-01-20 | 1 | -0/+57
  metze
  (This used to be ctdb commit a5cdf1277387f8c6292153c37fa9ceb64707d04f)
* server: add updateip event | Stefan Metzmacher | 2010-01-20 | 1 | -0/+8
  metze
  (This used to be ctdb commit 712ed0c4c0bff1be9e96a54b62512787a4aa6259)
* config: add CTDB_PARTIALLY_ONLINE_INTERFACES to ctdb.sysconfig | Stefan Metzmacher | 2010-01-20 | 1 | -0/+9
  With this option set to "yes", we don't become unhealthy as long as at least one interface is still available.
  metze
  (This used to be ctdb commit d054eb33c6ae92560cddb40732e5dcf622591a3c)
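As a rough illustration, enabling the option would presumably be a single assignment in the sysconfig file. The variable name and the value "yes" come from the commit message; the surrounding comment is only a paraphrase of it.

```shell
# Stay healthy as long as at least one monitored interface is still available.
CTDB_PARTIALLY_ONLINE_INTERFACES="yes"
```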
* config: 10.interfaces call monitor_interfaces on startup | Stefan Metzmacher | 2010-01-20 | 1 | -0/+8
  metze
  (This used to be ctdb commit 615dec051c26aac628f120e96bf12fb39fc6d28a)
* config: 10.interfaces call ctdb ifaces and ctdb setifacelink for monitoring | Stefan Metzmacher | 2010-01-20 | 1 | -1/+46
  metze
  (This used to be ctdb commit c465f63585c419ba59a6b04cbbf78ae615a7259d)
* events: split out a monitor_interfaces function in 10.interface | Stefan Metzmacher | 2010-01-20 | 1 | -45/+64
  metze
  (This used to be ctdb commit b5ba56dea57db97d6c6ba3e7582e74fe0e3041fc)
* events: 10.interfaces allow multiple interfaces per public address | Stefan Metzmacher | 2010-01-20 | 1 | -1/+1
  metze
  (This used to be ctdb commit f9837f8b6f887d28f29aeb3eeffe8cfb423b40b4)
* config: add 13.per_ip_routing event script | Stefan Metzmacher | 2010-01-20 | 1 | -0/+384
  With this script it's possible to generate routing tables per public ip address.
  metze
  (This used to be ctdb commit ff5678fbec2daef461143acf00cef3f94d7655fc)
* config: add interface_modify.sh and call it under flock to make modification on interfaces atomic | Stefan Metzmacher | 2010-01-20 | 1 | -52/+0
  When two releaseip events run in parallel, it's possible that the 2nd script re-adds a secondary ip that was removed by the 1st script.
  metze
  (This used to be ctdb commit e02417b2a55c45ac2c125b1b3463c9c39e7bc07a)
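The idea of serialising interface changes with flock can be sketched as below. This is only an illustration: the wrapper name `run_locked` and the lock file path in the usage note are hypothetical, and the real invocation of interface_modify.sh is not shown in this log. It relies on the `flock FILE COMMAND...` form of flock(1) from util-linux.

```shell
#!/bin/sh
# Hypothetical wrapper: run a command while holding an exclusive lock on
# the given lock file, so two concurrent callers cannot interleave.
run_locked () {
    lockfile=$1
    shift
    flock "$lockfile" "$@"
}
```

For example, `run_locked /var/lock/iface_eth0.lock ip addr del 10.0.0.1/24 dev eth0` would wait until any other holder of the same lock file has finished before modifying the interface.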
* events/10.interfaces: move some parts to helper functions | Stefan Metzmacher | 2010-01-20 | 1 | -28/+59
  metze
  (This used to be ctdb commit 24cd42769d8f32b90a8876a6a08a36ab23076cd1)
* server: add "init" event | Stefan Metzmacher | 2010-01-20 | 4 | -6/+15
  This is needed because the "startup" event runs after the initial recovery, but we need to do some actions before the initial recovery.
  metze
  (This used to be ctdb commit e953808449c102258abb6cba6f4abf486dda3b82)
* source the nfs sysconfig file from the 61.nfstickles script | Ronnie Sahlberg | 2010-01-20 | 1 | -0/+2
  (This used to be ctdb commit 085d1bea78fabf754ef6dd6d323f74a1d361e45c)
* Revert "Use wbinfo --ping-dc instead of wbinfo -p since this is a more reliable way to determine if winbindd is in a useful state." | Martin Schwenke | 2010-01-12 | 1 | -1/+1
  This reverts commit 7c95e56ba871a4e0cb893a5cb5d821e7ff6e6dd6.
  wbinfo --ping-dc is proving too unreliable.
  (This used to be ctdb commit b70021856e76df1ba407c83cfc19bf332fbfc869)
* Revert "events/50.samba: only use wbinfo --ping-dc if available" | Martin Schwenke | 2010-01-12 | 1 | -6/+1
  This reverts commit 7b73834ba3ac197cc8a3020c111f9bb2c567e70b.
  wbinfo --ping-dc is proving too unreliable.
  (This used to be ctdb commit 178f429a7b6d1008d35e857b6ca1df6adb60d255)
* Bond devices can have any name the user configures, so | Ronnie Sahlberg | 2009-12-09 | 1 | -8/+13
  when checking link status for an interface, first check if this interface is in fact a bond device (by the presence of a /proc/net/bonding/IFACE file) and use that file for checking status. Otherwise assume ib* is an infiniband interface which we don't know how to check, or otherwise it is an ethernet interface and ethtool should hopefully work.
  (This used to be ctdb commit 8cc6c5de3d7abb0b72eaa6e769e70963b02d84cb)
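The classification order described above might look like the following sketch. The function name `iface_kind` and its optional second argument (an alternative bonding directory, useful for testing) are hypothetical; only the /proc/net/bonding/IFACE test and the ib* pattern come from the commit message.

```shell
#!/bin/sh
# Hypothetical helper: decide how link status should be checked for an
# interface. A bond is recognised by its file under the bonding directory,
# whatever its name; otherwise ib* is treated as infiniband and anything
# else as ethernet (where ethtool should hopefully work).
iface_kind () {
    iface=$1
    bonding_dir=${2:-/proc/net/bonding}
    if [ -f "$bonding_dir/$iface" ]; then
        echo bond
        return
    fi
    case "$iface" in
    ib*) echo infiniband ;;
    *)   echo ethernet ;;
    esac
}
```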
* make sure to also check that interfaces used for NATGW are ok | Ronnie Sahlberg | 2009-12-09 | 1 | -0/+1
  and have a link. If not, the node should become unhealthy.
  (This used to be ctdb commit 03b5bbaae1b53830a4cd20d3079ab8f45ffce923)
* events/50.samba: only use wbinfo --ping-dc if available | Stefan Metzmacher | 2009-12-08 | 1 | -1/+6
  metze
  (This used to be ctdb commit 7b73834ba3ac197cc8a3020c111f9bb2c567e70b)
* Use wbinfo --ping-dc instead of wbinfo -p since this is a more reliable way to determine if winbindd is in a useful state. | Ronnie Sahlberg | 2009-12-07 | 1 | -1/+1
  (This used to be ctdb commit 7c95e56ba871a4e0cb893a5cb5d821e7ff6e6dd6)
* Eventscripts: Fix syntax error in 00.ctdb. | Martin Schwenke | 2009-12-01 | 1 | -0/+1
  Signed-off-by: Martin Schwenke <martin@meltin.net>
  (This used to be ctdb commit 9ea261f791ab919eb1ce5b37073b4f1d30694bb8)
* Eventscripts: Remove executable bit accidentally set on some scripts. | Martin Schwenke | 2009-12-01 | 3 | -0/+0
  Signed-off-by: Martin Schwenke <martin@meltin.net>
  (This used to be ctdb commit 4c6e68ae942c05224c5f8b683fbc2dc1adced8ee)
* Eventscript argument cleanups and introduction of ctdb_standard_event_handler. | Martin Schwenke | 2009-12-01 | 15 | -63/+84
  The functions file no longer causes a side-effect by doing a shift. It also doesn't set a convenience variable for $1. All eventscripts now explicitly use "$1" in their case statement, as does the initscript. The absence of a shift means that the takeip/releaseip events now explicitly reference $2-$4 rather than $1-$3.
  New function ctdb_standard_event_handler handles the status and setstatus events, and exits for either of those events. It is called via a default case in each eventscript, replacing an explicit status case where applicable.
  Signed-off-by: Martin Schwenke <martin@meltin.net>
  (This used to be ctdb commit 3d55408cbbb3bb71670b80f3dad5639ea0be5b5b)
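The resulting eventscript shape can be sketched as follows. Everything here is illustrative: `standard_event_handler` is a hypothetical stand-in for ctdb_standard_event_handler (whose real body is not shown in this log), `handle_event` stands in for the top-level case statement of an eventscript, and the meaning assigned to $2-$4 (interface, IP, netmask bits) is an assumption.

```shell
#!/bin/sh
# Hypothetical stand-in for the shared handler: deals with status and
# setstatus, and exits for either of those events.
standard_event_handler () {
    case "$1" in
    status|setstatus)
        echo "handled $1"
        exit 0
        ;;
    esac
}

# Sketch of an eventscript body: the event name stays in "$1" (no shift),
# so takeip/releaseip read their arguments from $2-$4.
handle_event () {
    case "$1" in
    takeip)
        echo "takeip iface=$2 ip=$3 maskbits=$4"
        ;;
    releaseip)
        echo "releaseip iface=$2 ip=$3 maskbits=$4"
        ;;
    *)
        # Default case replaces an explicit status case.
        standard_event_handler "$@"
        ;;
    esac
}
```

A real eventscript would end with the equivalent of `handle_event "$@"`, and events it does not care about simply fall through the default case.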
* More eventscript cleanups. Initial smoke testing seems OK. | Martin Schwenke | 2009-11-20 | 2 | -10/+20
  Apart from lots of cleanup work, this also fixes a bug where the share checks did not cope with directory names containing spaces. The previous commit also loaded the config incorrectly.
  Signed-off-by: Martin Schwenke <martin@meltin.net>
  (This used to be ctdb commit 35a60a63a9b5c7d98dde514ae552239506b691c9)
* Now vaguely tested initscript updates. | Martin Schwenke | 2009-11-19 | 7 | -88/+78
  Signed-off-by: Martin Schwenke <martin@meltin.net>
  (This used to be ctdb commit f1e350f9edb74cc44b6c5be4c062fd93e98ba8c4)
* More untested eventscript factorisation. | Martin Schwenke | 2009-11-19 | 14 | -215/+130
  Signed-off-by: Martin Schwenke <martin@meltin.net>
  (This used to be ctdb commit ac655b0a65b32d809d47fec9821f7f31bb2fe2a7)
* Eventscripts: Untested factorisations and introduction of status event. | Martin Schwenke | 2009-11-13 | 4 | -81/+67
  This is the first stage of an experimental change to eventscripts. Ronnie and I did a few hours of factorisation of 40.vsftpd and applied many of the changes to 41.httpd. Other eventscripts were also modified. At this stage this is completely untested.
  Signed-off-by: Martin Schwenke <martin@meltin.net>
  (This used to be ctdb commit 364e70b763f0ccd7714d15723ad3ea4d7e2968a1)
* Fix bashism in events.d/11.natgw | Mathieu Parent | 2009-11-10 | 1 | -1/+1
  Signed-off-by: Michael Adam <obnox@samba.org>
  (This used to be ctdb commit 6ccb495d1110157c06596763c7e252f3182c251e)
* Add a 99.timeout event script to trigger monitor timeouts. | Michael Adam | 2009-11-05 | 1 | -0/+24
  This just sleeps for twice the value of EventScriptTimeout in the monitor action. It is not run by default, but can be activated by setting CTDB_RUN_TIMEOUT_MONITOR in /etc/sysconfig/ctdb.
  Michael
  (This used to be ctdb commit 1a3ecdee85b82bb3234a92ae6bcdeb92238eb7ee)
* add an extra test for the bond devices and check that there is an active slave. | Ronnie Sahlberg | 2009-11-05 | 1 | -0/+4
  This is to handle the case where all links do have a physical layer, but where all slaves have been disabled using ifdown.
  (This used to be ctdb commit bf50709630df000583f2b0ef0edc177c01d60eaf)
* don't verify winbindd is running properly at startup | Ronnie Sahlberg | 2009-11-04 | 1 | -6/+0
  (This used to be ctdb commit 9e1b99221c8f257129641f6eda2795537b7ce9de)
* make the error logged when winbindd fails to access the dc during startup more scary and easier to spot in the logs | Ronnie Sahlberg | 2009-10-29 | 1 | -1/+3
  (This used to be ctdb commit 0c9b0466fd87b3f1e5d53f867c863217802ac43b)
* Revert "update the "uptime" command to indicate the "time since last" is the time since the last recovery OR failover." | Ronnie Sahlberg | 2009-10-29 | 1 | -1/+1
  This reverts commit 3b0d44497800a16400d05a30bdaf6e6c285d4b36.
  (This used to be ctdb commit cb36bbb5418290e8e5b770d2d836285b15da2a6f)
* update the "uptime" command to indicate the "time since last" is the time since the last recovery OR failover. | Ronnie Sahlberg | 2009-10-29 | 1 | -1/+1
  (This used to be ctdb commit 3b0d44497800a16400d05a30bdaf6e6c285d4b36)
* add a check that winbind can actually talk to the dc during the startup event | Ronnie Sahlberg | 2009-10-27 | 1 | -0/+4
  and refuse to start up if it cannot
  (This used to be ctdb commit 4037b6e73a819a8e2463dfe0959b42875e05e106)
* treat interfaces with the name ethX* as bond devices | Ronnie Sahlberg | 2009-10-21 | 1 | -1/+1
  (This used to be ctdb commit 3997d7e5471810e9a2f145ce2e795073dfc5eded)
* wait a bit longer before shutting down when the reclock file is missing | Ronnie Sahlberg | 2009-10-19 | 1 | -3/+5
  print the filename of the missing file when we turn unhealthy and also a 'df'
  (This used to be ctdb commit 97ded8a629ec762f71bad28515e4fbc810790b1d)
* Revert "don't shut down a node when the reclock file is temporarily unavailable." | Ronnie Sahlberg | 2009-10-19 | 1 | -2/+7
  This reverts commit f5e9f3007c10a937158bc8cdfabf33c984cf9c50.
  (This used to be ctdb commit 02f68dc60e0b7bf26d631850b12834d5c71a88f2)
* don't shut down a node when the reclock file is temporarily unavailable. | Ronnie Sahlberg | 2009-10-15 | 1 | -7/+2
  Leave the node as UNHEALTHY; this stops clients from accessing the node until the reclock file can be accessed again.
  (This used to be ctdb commit f5e9f3007c10a937158bc8cdfabf33c984cf9c50)
* always create the nfs state directories during the monitor event. | Ronnie Sahlberg | 2009-10-14 | 1 | -0/+5
  this allows us to configure and enable nfs at runtime without having to restart ctdbd
  (This used to be ctdb commit f6e39d35713475defaa08a623e194f3f2f8f7d53)
* Merge commit 'martins/master' | Ronnie Sahlberg | 2009-10-12 | 1 | -2/+2
|\  (This used to be ctdb commit 5f14874c5c705dd637f88a77f30c930fea1201d2)
| * 40.vsftpd: reset the fail counter in the "recovered" event. | Martin Schwenke | 2009-10-12 | 1 | -2/+2
| |   Each recovery that involves IP reassignments results in a restart of vsftpd in the "recovered" event. Currently, we can have several recoveries in quick succession and the "monitor" event following each can fail because vsftpd isn't ready yet. This results in cumulative failures, so the node is marked unhealthy, even though vsftpd has never had a proper opportunity to become ready.
| |   This resets the fail count after each recovery.
| |   While we're here, also move the delete of the restart flag file into the body of the conditional.
| |   Signed-off-by: Martin Schwenke <martin@meltin.net>
| |   (This used to be ctdb commit 318abeb4b913a8d846e7eaf4cf5c2a67b61ce974)
* | update natgw eventscript to allow you to force it to update and / or to remove the configuration at runtime | Ronnie Sahlberg | 2009-10-06 | 1 | -2/+2
|/  (This used to be ctdb commit deed52b7e4aac94b4d11a8d89d08739e1dfd4ed7)
* Minor fixes to 01.reclock eventscript. | Martin Schwenke | 2009-09-30 | 1 | -3/+2
  test -z really needs its argument to be quoted. Simplified a status test.
  Signed-off-by: Martin Schwenke <martin@meltin.net>
  (This used to be ctdb commit fe26da7780545b1ecc0a7da5bc1cf8beaeea94cc)
* 40.vsftpd monitor event only fails after 2 failures to connect to port 21. | Martin Schwenke | 2009-09-30 | 2 | -14/+31
  Change the monitor event in 40.vsftpd so it only fails if there are 2 successive failures connecting to port 21. This reduces the likelihood of unhealthy nodes due to vsftpd being restarted for reconfiguration due to node failover or system reconfiguration.
  New eventscript functions ctdb_counter_init, ctdb_counter_incr, ctdb_counter_limit. These are used to count arbitrary things in eventscripts, depending on the eventscript name and a tag that is passed, and determine if a specified limit has been hit. They're good for counting failures! These functions are used in 40.vsftpd and also in 01.reclock - the latter used to do the counting without these functions.
  Signed-off-by: Martin Schwenke <martin@meltin.net>
  (This used to be ctdb commit cfe63636a163730ae9ad3554b78519b3c07d8896)
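A failure counter of the kind described could be built along these lines. This is a sketch under stated assumptions, not the ctdb implementation: the names `counter_init`, `counter_incr`, `counter_get` and `counter_limit`, the single tag argument, and the file-per-counter storage under `COUNTER_DIR` are all hypothetical, and the real functions also key on the eventscript name.

```shell
#!/bin/sh
# Hypothetical storage location for counters; one file per tag.
COUNTER_DIR=${COUNTER_DIR:-/tmp/eventscript-counters}

counter_file () { echo "$COUNTER_DIR/$1"; }

# Reset the counter for a tag to zero.
counter_init () { mkdir -p "$COUNTER_DIR" && : > "$(counter_file "$1")"; }

# Add one to the counter by appending a line to its file.
counter_incr () { mkdir -p "$COUNTER_DIR" && echo x >> "$(counter_file "$1")"; }

# Print the current count (0 if the counter was never initialised).
counter_get () {
    f=$(counter_file "$1")
    if [ -f "$f" ]; then
        echo $(( $(wc -l < "$f") ))
    else
        echo 0
    fi
}

# Succeed (status 0) once the count has reached the given limit.
counter_limit () { [ "$(counter_get "$1")" -ge "$2" ]; }
```

A monitor event could then call `counter_incr ftp` on each failed connection, fail only when `counter_limit ftp 2` succeeds, and call `counter_init ftp` after a successful check or a recovery.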
* From Wolfgang Mueller-Friedt | Ronnie Sahlberg | 2009-09-29 | 1 | -23/+0
  Remove the explicit vacuum/repack commands from the 00.ctdb eventscript and implement this in the ctdb daemon. Combine vacuuming and repacking into one cheap read traverse to enumerate all candidate records and one write traverse that both repacks the database and also deletes the record locally where we are lmaster and where the records have already been deleted remotely.
  This code also adds initial autotuning heuristics for the vacuum intervals and how many records to delete in each iteration.
  Minor stylistic changes made by ronnie s
  (This used to be ctdb commit 95a3ee551241aa164967991fe5efe078e1714bde)
* change the reclock fail count to 19 monitor intervals before we shut down ctdbd | Ronnie Sahlberg | 2009-09-28 | 1 | -2/+2
  (This used to be ctdb commit 6e35feb06ec036b9036c5d1cdd94f7cef140d8a6)
* add a new eventscript 01.reclock | Ronnie Sahlberg | 2009-09-28 | 1 | -0/+58
  If the reclock file has been set, then this script will test that the reclock file can actually be accessed. If the file does not exist, or if the attempt to stat the file hangs, the node will be marked unhealthy after the third failed monitoring event, and after the tenth failure ctdb itself will shut down.
  (This used to be ctdb commit 2cb04747887674def299e574fccb827c1c3194e7)
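The escalation policy in this message can be illustrated with a small sketch. The function name `reclock_action` is hypothetical, and the thresholds are the ones stated in this commit (third and tenth failure); detecting a hanging stat is not covered here.

```shell
#!/bin/sh
# Hypothetical helper: map the number of consecutive failed reclock checks
# to the action described in the commit message.
reclock_action () {
    failures=$1
    if [ "$failures" -ge 10 ]; then
        echo shutdown      # tenth failure: ctdb itself shuts down
    elif [ "$failures" -ge 3 ]; then
        echo unhealthy     # third failure: node is marked unhealthy
    else
        echo ok
    fi
}
```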
* try restarting statd indefinitely, not just once | Ronnie Sahlberg | 2009-09-15 | 1 | -25/+9
  (This used to be ctdb commit 03b0d913ae009284e2fadda1b9246ec77d19db29)