path: root/ctdb/config
Commit message | Author | Age | Files | Lines
...
* scripts: ctdb-crash-cleanup.sh uses initscript to see if ctdbd is running | Martin Schwenke | 2013-04-18 | 1 | -1/+1
  "ctdb ping" can time out. How many times should we try? Instead, depend on the initscript to implement something sane.
  Signed-off-by: Martin Schwenke <martin@meltin.net>
  Reviewed-by: Michael Adam <obnox@samba.org>
  (This used to be ctdb commit 90cb337e5ccf397b69a64298559a428ff508f196)
* initscript: Use a PID file to implement the "status" option | Martin Schwenke | 2013-04-18 | 1 | -30/+56
  Using "ctdb ping" and "ctdb status" is fraught with danger. These commands can time out when ctdbd is running, leading callers to believe that ctdbd is not running. Timeouts could be increased but we would still have to handle potential timeouts.
  Everything else in the world implements the "status" option by checking if the relevant process is running. This change makes CTDB do the same thing and uses standard distro functions.
  This change is backward compatible in the sense that a missing /var/run/ctdb/ directory means that we don't do a PID file check but just depend on the distro's checking method. Therefore, if CTDB was started with an older version of this script then "service ctdb status" will still work.
  This script does not support changing the value of CTDB_VALGRIND between calls. If you start with CTDB_VALGRIND=yes then you need to check status with the same setting. CTDB_VALGRIND is a debug variable, so this is acceptable.
  This also adds sourcing of /lib/lsb/init-functions to make the Debian function status_of_proc() available.
  Signed-off-by: Martin Schwenke <martin@meltin.net>
  Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>
  Reviewed-by: Michael Adam <obnox@samba.org>
  (This used to be ctdb commit 687e2eace4f48400cf5029914f62b6ddabb85378)
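The PID-file approach described in this commit can be illustrated with a minimal sketch. This is not the initscript's actual code: pidfile_status is a hypothetical helper, and the return values follow the usual LSB "status" conventions (0 running, 1 dead but PID file exists, 3 not running).

```sh
#!/bin/sh
# Hypothetical helper (not taken from the CTDB initscript): report whether
# the process recorded in a PID file is still alive.
pidfile_status ()
{
    _pidfile="$1"

    # No PID file at all: treat the service as not running.
    [ -f "$_pidfile" ] || return 3

    # Read the first line; ignore read's status so that a file without
    # a trailing newline is still accepted.
    _pid=""
    read -r _pid < "$_pidfile" || :

    # Reject empty or non-numeric content.
    case "$_pid" in
        ''|*[!0-9]*) return 1 ;;
    esac

    # kill -0 sends no signal; it only checks that the process exists
    # and that we are allowed to signal it.
    kill -0 "$_pid" 2>/dev/null || return 1
    return 0
}
```

Unlike "ctdb ping", a check of this kind does not talk to the daemon, so it cannot time out when ctdbd is busy.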
* statd-callout: Make sure statd callout script always runs as root | Amitay Isaacs | 2013-04-08 | 2 | -0/+6
  In RHEL 6+, rpc.statd runs as "rpcuser" instead of root as on RHEL 5. This prevents CTDB tool commands from talking to the daemon, since "rpcuser" cannot access the CTDB socket.
  Signed-off-by: Amitay Isaacs <amitay@gmail.com>
  Pair-Programmed-With: Martin Schwenke <martin@meltin.net>
  (This used to be ctdb commit fe8c4880b371492a38554868d4ca10918c54e412)
* eventscripts: Remove calls to "smbstatus -np" for samba cleanup | Amitay Isaacs | 2013-02-11 | 1 | -29/+3
  This is an artifact from older versions of Samba. In newer versions of Samba, the "smbstatus -np" command does not do anything useful, but it causes a traverse in CTDB, which is expensive and makes CPU utilization shoot up.
  Signed-off-by: Amitay Isaacs <amitay@gmail.com>
  (This used to be ctdb commit 053b89c6dbce47001505524606889334559d2ec4)
* initscript: export CTDB_EXTERNAL_TRACE | Martin Schwenke | 2013-02-05 | 1 | -1/+1
  This means it can be set like any other configuration option in the configuration file, without needing to export it there.
  Signed-off-by: Martin Schwenke <martin@meltin.net>
  (This used to be ctdb commit a0ef73e197dc9147f7718e0813fe803ff0b3d54d)
* ctdbd: Remove command-line option --debug-hung-script | Martin Schwenke | 2013-02-05 | 1 | -1/+7
  Use an environment variable instead. This just means that the initscript exports CTDB_DEBUG_HUNG_SCRIPT and the code checks for the environment variable.
  The justification for this simplification is that more debug options will be arriving soon and we want to handle them consistently without needing to add a command-line option for each. So, the convention will be to use an environment variable for each debug option.
  Signed-off-by: Martin Schwenke <martin@meltin.net>
  (This used to be ctdb commit 0581f9a84e58764d194f4e04064c2c5b393c348b)
* doc: allows to -> allows one to | Mathieu Parent | 2013-01-22 | 1 | -1/+1
  Signed-off-by: Mathieu Parent <math.parent@gmail.com>
  (This used to be ctdb commit 95fc493a7d4145f976cb3fe928d9e92faec4dd71)
* Changes for unobtrusive recovery and new method for health check. | Srikrishan Malik | 2013-01-11 | 3 | -75/+156
  Unobtrusive recovery: Ganesha will not be restarted on failovers.
  Ganesha health: Use the counters in /var/lib/nfs/ganesha_local to track progress instead of the null call, which can time out if the server is too busy.
  Signed-off-by: Srikrishan Malik <srimalik@in.ibm.com>
  Signed-off-by: Lance Russell <lancerus@us.ibm.com>
  (This used to be ctdb commit 0e651e9da0f1f3c836b4474612ab13d0ccd272d9)
* eventscripts: Fail the setup event if CTDB does not become ready | Martin Schwenke | 2013-01-09 | 1 | -4/+3
  Currently it silently continues without attempting to set tunables.
  Signed-off-by: Martin Schwenke <martin@meltin.net>
  (This used to be ctdb commit 735ec99b99c7bb579851ce8293011aaf1dcc552a)
* scripts: Make script_log() use supplied message, stop logger from hanging | Martin Schwenke | 2013-01-08 | 1 | -1/+1
  When using syslog any provided message arguments are ignored and not passed to logger. This means that logger blocks waiting on stdin. That's bad.
  Signed-off-by: Martin Schwenke <martin@meltin.net>
  (This used to be ctdb commit 50abf597cefe6f8ea2a2ff7694bf84641344a9b1)
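The failure mode described here (logger reading stdin when no message is passed on its command line) can be sketched as follows. This is an illustration rather than the real script_log(): log_msg and the LOGGER override are hypothetical names, introduced so the sketch can be tried without a syslog daemon.

```sh
#!/bin/sh
# LOGGER is a hypothetical override; by default the real logger(1) is used.
: "${LOGGER:=logger}"

# Hypothetical stand-in for script_log(): log_msg <tag> [message...]
log_msg ()
{
    _tag="$1" ; shift
    if [ $# -gt 0 ] ; then
        # A message was supplied as arguments: hand it to logger
        # explicitly so that logger never waits on stdin.
        $LOGGER -t "$_tag" "$*"
    else
        # No message arguments: forward whatever arrives on stdin.
        $LOGGER -t "$_tag"
    fi
}
```

A caller would use either `log_msg ctdbd "some message"` or `some_command | log_msg ctdbd`.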
* scripts: Rework ctdb-crash-cleanup.sh so that it uses existing functions | Martin Schwenke | 2013-01-08 | 1 | -32/+17
  This improves maintainability.
  Signed-off-by: Martin Schwenke <martin@meltin.net>
  (This used to be ctdb commit e2aaa64925cca359c71520e01a18fc9461b0da4d)
* scripts: Make drop_all_public_ips() more robust | Martin Schwenke | 2013-01-08 | 1 | -3/+32
  Incorporate some of the logic from ctdb-crash-cleanup.sh that ensures IPs are deleted even if they have the wrong netmask or are on the wrong interface.
  Factoring out some of the code will allow it to be used elsewhere.
  Signed-off-by: Martin Schwenke <martin@meltin.net>
  (This used to be ctdb commit 03356fd5ae7a3ac35fde0289cbea7c71ecf07367)
* scripts: debug-hung-script.sh doesn't need functions/loadconfig | Martin Schwenke | 2013-01-08 | 1 | -3/+0
  Signed-off-by: Martin Schwenke <martin@meltin.net>
  (This used to be ctdb commit 8507303b525d20c74e8ec4e7c4f5f275945cd3b6)
* scripts: statd-callout should calculate CTDB_BASE if it is not set | Martin Schwenke | 2013-01-08 | 1 | -3/+2
  Signed-off-by: Martin Schwenke <martin@meltin.net>
  (This used to be ctdb commit 376015ba5ad6b7703ae9949a1d40a0c72dfaba0c)
* eventscripts: Each script should set CTDB_BASE if it is not set | Martin Schwenke | 2013-01-08 | 19 | -0/+57
  This makes it easier to run the scripts externally.
  Signed-off-by: Martin Schwenke <martin@meltin.net>
  (This used to be ctdb commit 740ea8ea5084149c8b552a01ee1c98c558b12384)
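One way a script can default CTDB_BASE from its own location is sketched below. The assumption that the script lives one directory below the base directory is for illustration only; the exact expression used in the eventscripts is not shown in this log.

```sh
#!/bin/sh
# Keep a value supplied by the caller; otherwise derive CTDB_BASE from the
# directory containing this script. ASSUMPTION: the script sits one level
# below the base directory (for example <base>/events.d/).
[ -n "${CTDB_BASE:-}" ] || \
    CTDB_BASE=$(cd -P "$(dirname -- "$0")/.." && pwd)
export CTDB_BASE
```

With a default like this, a script can be run by hand from a source tree or a test harness without first exporting CTDB_BASE.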
* scripts: Move drop_all_public_ips() to the functions file | Martin Schwenke | 2013-01-08 | 2 | -6/+6
  ... so it can be improved and used elsewhere.
  Signed-off-by: Martin Schwenke <martin@meltin.net>
  (This used to be ctdb commit b23c30253cc9eb274b895cac0f8c65245ba0a200)
* Eventscripts: Change the default reconfigure action to do nothing | Martin Schwenke | 2013-01-07 | 5 | -14/+12
  A default action of restarting the service doesn't obey the principle of least surprise. It caused the NFS service restart to be implicitly reintroduced.
  This allows no-op functions to be removed from some eventscripts and service restart functions to be added to others.
  Signed-off-by: Martin Schwenke <martin@meltin.net>
  (This used to be ctdb commit c75b5e5b4d000f5c7dab403df8238ceed390c1c0)
* Eventscripts: Do not restart NFS on reconfigure | Martin Schwenke | 2013-01-07 | 1 | -2/+0
  It looks like this restart was accidentally reintroduced in commit fc0678d351187cfa4c71123f97c0f493aacd5d16 when $service_reconfigure became unset so the default action of restarting the service would occur. From there cleanups have explicitly reintroduced it and carried it through the code.
  Also update the unit tests affected by this change.
  The restart was originally removed in commit bc481c3f1a44c50648488c4f8a7f15ec395d446f.
  The default reconfigure action of restarting a service is clearly suboptimal and will be addressed in a separate patch.
  Signed-off-by: Martin Schwenke <martin@meltin.net>
  (This used to be ctdb commit 2629de72e1f37b5e46772c2ef8d8d0012fc4ed37)
* Initscript: when checking status, print output of "ctdb ping" if it fails | Martin Schwenke | 2013-01-07 | 1 | -1/+4
  At the moment the caller has no idea why it thinks CTDB isn't running and we can't debug failures...
  Signed-off-by: Martin Schwenke <martin@meltin.net>
  (This used to be ctdb commit 776590bf84d221092298346a28d7fc0552a67c9d)
* events/50.samba: fix testparm background update | Michael Adam | 2013-01-05 | 1 | -1/+1
  Creating the smb.conf cache with "-v" results in a cache file that later fails to load with "testparm -s ..." because "copy = " cannot be processed. (Copying the empty service name fails.)
  Signed-off-by: Michael Adam <obnox@samba.org>
  (This used to be ctdb commit 81788cfabe960497b050c5ee4e4e487ee061012a)
* Eventscripts: 10.interface should list configured interfaces | Martin Schwenke | 2012-11-19 | 1 | -3/+3
  The current code lists available interfaces. If IPs are configured in some other way than the public addresses file (e.g. ctdb addip) and their interfaces default to being marked down then, since down interfaces are not available, these interfaces can never be marked up. The configured interfaces should be listed instead.
  Signed-off-by: Martin Schwenke <martin@meltin.net>
  (This used to be ctdb commit d8f010355b715e49709836e057a5d0f110919897)
* Eventscripts: 10.interface startup event should only process interfaces once | Martin Schwenke | 2012-11-14 | 1 | -7/+4
  Provided that monitor_interfaces() sets the state of each interface, there's no need to mark all interfaces as up before running monitor_interfaces() in the startup event. monitor_interfaces() will set the true status of each interface anyway.
  The duplication is unnecessary and may cause extra action in the recovery daemon because the state of some interfaces is changed an extra time.
  Instead, add a comment at the top of the loop in monitor_interfaces() to warn against early loop exits.
  Signed-off-by: Martin Schwenke <martin@meltin.net>
  (This used to be ctdb commit f243a916ee71013f7402b9c396c2ead88eb3aab0)
* Avoid a bashism in 60.ganesha | Volker Lendecke | 2012-10-24 | 1 | -1/+1
  This file is #!/bin/sh. On sn-devel at least, with this /bin/sh the shell does not like == for string equality.
  (This used to be ctdb commit e2213db479129ce9c2b2fb88ec8c53cbd33d54b3)
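The portability issue can be shown with a short sketch. The variable name and value are hypothetical and not taken from 60.ganesha; the point is only the comparison operator.

```sh
#!/bin/sh
state="up"   # hypothetical value, for illustration only

# Portable: POSIX test(1) defines "=" for string equality.
if [ "$state" = "up" ] ; then
    echo "state is up"
fi

# Not portable: "==" is accepted by bash but rejected by some /bin/sh
# implementations, so it appears here only as a comment:
#   if [ "$state" == "up" ] ; then ...
```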
* scripts: Refactor logging code in initscript and functions file | Martin Schwenke | 2012-10-18 | 2 | -23/+24
  Signed-off-by: Martin Schwenke <martin@meltin.net>
  (This used to be ctdb commit 5ee242c949a98bb7397e0f7368b20d44c06fe772)
* initscript: Check that rc.ctdb is executable before running it | Martin Schwenke | 2012-10-18 | 1 | -1/+1
  Signed-off-by: Martin Schwenke <martin@meltin.net>
  (This used to be ctdb commit 59a47c0674bacfebc17a1b44f0244727bf2fa7a4)
* Revert "Eventscripts - add facility to 10.interface to delete unmanaged IPs" | Martin Schwenke | 2012-10-18 | 1 | -29/+0
  This reverts commit 88f88d86b0d08240f749fb721b8c401c2eeb1099.
  This is dangerous and, on reflection, I can't see it being useful. There are often permanent IPs on interfaces that CTDB shares with its public IPs.
  (This used to be ctdb commit 16aba4eb620844626a1c71c58b51658caf44dea6)
* Eventscripts: "recovered" event should not fail on NATGW failure | Martin Schwenke | 2012-10-18 | 1 | -5/+25
  The recovery process has no protection against the "recovered" event failing, so this can cause a recovery loop.
  Instead of failing the "recovered" event, add a "monitor" event and fail that instead. In this case the failure semantics are well defined.
  A separate patch should ban nodes if the "recovered" event fails for an unknown reason.
  Signed-off-by: Martin Schwenke <martin@meltin.net>
  (This used to be ctdb commit eaa7c165f58abd7e259c37d76b7dd37c91e13d9f)
* common: Debug ctdb_addr_to_str() using new function ctdb_external_trace() | Martin Schwenke | 2012-10-18 | 1 | -0/+3
  We've seen this function report "Unknown family, 0" and then CTDB disappeared without a trace. If we can reproduce it then this might help us to debug it.
  The idea is that you do something like the following in /etc/sysconfig/ctdb:
    export CTDB_EXTERNAL_TRACE="/etc/ctdb/config/gcore_trace.sh"
  When we hit this error then we call out to gcore to get a core file so we can do forensics. This might block CTDB for a few seconds.
  Signed-off-by: Martin Schwenke <martin@meltin.net>
  (This used to be ctdb commit 7895bc003f087ab2f3181df3c464386f59bfcc39)
* config/functions: fix a comment | Michael Adam | 2012-10-17 | 1 | -1/+1
  ctdb_check_counter_limits does not fail but succeeds if count >= limit.
  Signed-off-by: Michael Adam <obnox@samba.org>
  (This used to be ctdb commit af540ef728303b4a0a188b17c695e9aefab34489)
* doc: Add info about execute permissions on event scripts | Amitay Isaacs | 2012-10-17 | 1 | -0/+2
  Signed-off-by: Amitay Isaacs <amitay@gmail.com>
  (This used to be ctdb commit 25d886060b138bc5e78fe93d7bebe3990264f29d)
* doc: Fix documentation for setup event | Amitay Isaacs | 2012-10-17 | 1 | -5/+3
  Signed-off-by: Amitay Isaacs <amitay@gmail.com>
  (This used to be ctdb commit 36d25e96a2f8ae1461c5a708a2922f0475a39900)
* scripts: Remove duplicate code from init script to set tunables | Amitay Isaacs | 2012-10-17 | 2 | -21/+30
  The tunable variables defined in the CTDB configuration file are currently set up from the init script as well as in the "setup" event in the 00.ctdb eventscript. Remove the duplication of this code and set tunable variables only from the setup event.
  During the "setup" event, ctdb tool commands can time out if the CTDB daemon is not ready. To guard against this, wait until the "ctdb ping" command succeeds before executing any other ctdb tool commands.
  Signed-off-by: Amitay Isaacs <amitay@gmail.com>
  (This used to be ctdb commit 632c1b9c1cc2e242376358ce49fd2022b3f27aa2)
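The "wait until ctdb ping succeeds" guard can be sketched as a bounded retry loop. wait_until is a hypothetical helper, and the retry count and one-second delay are assumptions, not values taken from 00.ctdb.

```sh
#!/bin/sh
# Hypothetical helper: wait_until <max_tries> <command> [args...]
# Runs the command up to max_tries times, one second apart, and returns 0
# as soon as it succeeds, or 1 if every attempt fails.
wait_until ()
{
    _tries="$1" ; shift
    _n=0
    while [ "$_n" -lt "$_tries" ] ; do
        "$@" >/dev/null 2>&1 && return 0
        _n=$((_n + 1))
        sleep 1
    done
    return 1
}
```

In a setup event this might be used as `wait_until 10 ctdb ping || exit 1`, so that tunables are only set once the daemon answers and the event fails visibly otherwise.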
* Eventscripts: Add support for "reconfigure" pseudo-event for policy routing | Martin Schwenke | 2012-10-11 | 1 | -2/+17
  This rebuilds all policy routes and can be used if the configuration changes.
  Signed-off-by: Martin Schwenke <martin@meltin.net>
  (This used to be ctdb commit c185ffd2822fcee26d07398464c59b66c61f53fa)
* Eventscripts: Add service-start and service-stop pseudo-events | Martin Schwenke | 2012-10-10 | 1 | -2/+28
  Signed-off-by: Martin Schwenke <martin@meltin.net>
  (This used to be ctdb commit be4ad110ede9981b181ac28f31ffd855a879d5df)
* eventscripts: Auto-start/stop services in background | Martin Schwenke | 2012-10-03 | 1 | -2/+26
  If $CTDB_SERVICE_AUTOSTARTSTOP="yes" then service start/stop is done in the background with logging.
  Fix some unit tests for samba and winbind.
  Signed-off-by: Martin Schwenke <martin@meltin.net>
  (This used to be ctdb commit 3a3dae4cb5ec8b4b8381a4013adda25b87641f3a)
* Eventscripts: split 50.samba into 49.winbind and 50.samba | Martin Schwenke | 2012-10-03 | 2 | -143/+127
  winbind and samba can be separately managed. This makes the service starting and stopping code way too complicated, and even adds a small amount of complexity to the monitoring code.
  The sensible option is to split this eventscript in two.
  There are two potentially backward incompatible changes here:
  * Functionality has been removed that allowed 50.samba to manage winbind when CTDB_MANAGES_WINBIND was unset but the smb.conf "security" parameter was set to "ADS" or "DOMAIN". Maintaining this functionality would have required moving the testparm-related code to the functions file, deciding where the cache file should go, and then calling it from both 49.winbind and 50.samba. This feature wasn't of great value and asking administrators to set an extra variable in exchange for code simplicity seems like a reasonable deal.
  * External code will need to be changed if it calls 50.samba directly with winbind-related expectations. This is fairly obvious!
  Signed-off-by: Martin Schwenke <martin@meltin.net>
  (This used to be ctdb commit 34535ae64420926b9a3bf7d453fed4e6f4c90115)
* Initscript: Kill any existing ctdbd processes if the ping succeeds | Martin Schwenke | 2012-10-02 | 1 | -0/+6
  Initialising a new ctdbd will destroy the Unix domain socket so existing processes will be useless anyway.
  Signed-off-by: Martin Schwenke <martin@meltin.net>
  (This used to be ctdb commit 043ef77086797a703aec436a26a05c56a1bcbf2b)
* Eventscripts: Indent error when a route delete fails in 11.per_ip_routing | Martin Schwenke | 2012-09-11 | 1 | -2/+8
  This puts it under the umbrella of the previous warning that should also have been printed.
  Signed-off-by: Martin Schwenke <martin@meltin.net>
  (This used to be ctdb commit 5c3be8f26dcde0b1b3d86928953e74d4a8b35958)
* eventscripts: 13.per_ip_routing should remove bogus routes on ipreallocated | Martin Schwenke | 2012-09-11 | 1 | -0/+26
  Signed-off-by: Martin Schwenke <martin@meltin.net>
  (This used to be ctdb commit d0d0a6f19960f233224970b8d5d19b0e37222616)
* eventscripts: Print a warning on failure to delete a routing rule | Martin Schwenke | 2012-09-11 | 1 | -4/+12
  del_routing_for_ip() currently fails silently, which could hide real errors.
  In add_routing_for_ip() we don't want to see any error when calling del_routing_for_ip(), since we don't expect the rule to be there.
  Signed-off-by: Martin Schwenke <martin@meltin.net>
  (This used to be ctdb commit 30d69defa7e97ab5e3ba0492a27868dde2616494)
* Eventscripts: 13.per_ip_routing should always fail if config is missing | Martin Schwenke | 2012-07-30 | 1 | -2/+11
  Currently, if the configuration file is specified by $CTDB_PER_IP_ROUTING_CONF but is missing, takeip fails but (the absent) monitor event "succeeds", so the state of a node will flip-flop.
  Instead of this, if the configuration file is missing then fail early on for all events.
  Signed-off-by: Martin Schwenke <martin@meltin.net>
  (This used to be ctdb commit c64c6c77c3f6aa2898e5a575547b587bea868c76)
* Revert "Eventscripts - make 13.per_ip_routing fail gracefully if config is missing" | Martin Schwenke | 2012-07-30 | 1 | -7/+2
  When the configuration file is missing this causes the node to flip-flop between unhealthy (when takeip fails) and healthy (no monitor event here).
  Will reimplement this properly.
  This reverts commit 351ca413eec460330571ca8b01ad269728fe15df.
  (This used to be ctdb commit 5277d749c9111716fd723647d5421907476422bf)
* Eventscripts: Clean up 11.routing | Martin Schwenke | 2012-07-30 | 1 | -9/+8
  The loops can all be done without cat or grep. The pair of loops in updateip is combined into a single loop.
  Signed-off-by: Martin Schwenke <martin@meltin.net>
  (This used to be ctdb commit 96fdda124f5511fb76190e7c7a7f0b98e6b01a31)
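A loop over a configuration file that needs neither cat nor grep might look like the sketch below. The function name, the three-column "interface destination gateway" layout and the output line are assumptions made for illustration; they are not copied from 11.routing.

```sh
#!/bin/sh
# Hypothetical example: process_routes <file>
# Reads "interface destination gateway" lines with the shell's own read
# builtin and filters comments and blank lines with a case statement, so
# no external cat or grep process is needed.
process_routes ()
{
    while read -r iface dest gw ; do
        case "$iface" in
            ''|'#'*) continue ;;   # skip blank lines and comments
        esac
        echo "route: $dest via $gw dev $iface"
    done < "$1"
}
```

Redirecting the file into the loop replaces `cat file | while ...`, and the case pattern replaces a `grep -v` filter.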
* Initscript: clean up drop_all_public_ips() | Martin Schwenke | 2012-07-26 | 1 | -7/+3
  This makes the case implicit where $CTDB_PUBLIC_ADDRESSES is unset. This is OK because that's not an interesting code path.
  Signed-off-by: Martin Schwenke <martin@meltin.net>
  (This used to be ctdb commit 5b2725d1ae052e848c2487cb10c5393a877d118c)
* statd-callout: Fix a bug in the calculations of $STATE | Martin Schwenke | 2012-07-26 | 1 | -3/+2
  It is just meant to be even, so divided *and* multiplied by 2. Use $(( )) to make it more readable.
  While touching this code, make the related calculation a bit more readable too.
  Signed-off-by: Martin Schwenke <martin@meltin.net>
  (This used to be ctdb commit 25d45e69f4ffc2b26061ac13038d52a353e79e61)
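The "divide and multiply by 2" idiom relies on integer division discarding the remainder. A small worked example, with a hypothetical input value:

```sh
#!/bin/sh
n=7                       # hypothetical input value
# Integer division drops the remainder (7 / 2 = 3); multiplying by 2
# then yields the largest even number not greater than n.
even=$(( n / 2 * 2 ))
echo "$even"              # prints 6
```

An input that is already even is left unchanged.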
* Eventscripts: Default route on NAT gateway should have a metric of 10 | Martin Schwenke | 2012-07-26 | 1 | -1/+1
  At the moment routes from 11.routing can fail to be added because they conflict with the default route added by 11.natgw. NAT gateway is meant to be a last resort, so routes from 11.routing should override it.
  Signed-off-by: Martin Schwenke <martin@meltin.net>
  (This used to be ctdb commit 624f4677e99ed1710a0ace76201150349b1a0335)
* Eventscripts: Update/remove stale comments in 11.natgw | Martin Schwenke | 2012-07-26 | 1 | -7/+2
  Signed-off-by: Martin Schwenke <martin@meltin.net>
  (This used to be ctdb commit 5d713d5e5be67f5914a661694c15d938bd67dea3)
* Eventscripts: Retrieve and build NAT gateway details better in 11.natgw | Martin Schwenke | 2012-07-26 | 1 | -9/+8
  * "ctdb natgw" is run twice when it doesn't need to be.
  * Tweak the parsing of "ctdb natgw" output so that it is done by the shell instead of a bunch of external processes.
  * Make default NAT gateway be -1, even on error. If the process failed entirely then it could previously be empty.
  * Streamline the error handling using die() for when there is no NAT gateway.
  * Downcase script-local variable names.
  Signed-off-by: Martin Schwenke <martin@meltin.net>
  (This used to be ctdb commit 630cfe6451ba23d959fa4907fbba42702337ed3b)
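Parsing command output in the shell itself, with a -1 default on error, can be sketched as below. The "<node number> <IP address>" first-line layout, the function name and the variable names are assumptions for illustration; the actual "ctdb natgw" output format is not shown in this log.

```sh
#!/bin/sh
# Hypothetical helper: parse_natgw <output-text>
# Sets natgwmaster and natgwip from the first line of the given text.
parse_natgw ()
{
    # A here-document feeds the text to the read builtin, so no
    # external program (head, cut, sed, ...) is executed.
    read -r natgwmaster natgwip <<EOF
$1
EOF
    # Default to -1 when nothing usable was parsed, for example because
    # the command that produced the text failed and printed nothing.
    natgwmaster="${natgwmaster:--1}"
}
```

A caller could capture the command output once into a variable and pass it in, which also avoids running the command a second time.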
* Eventscripts: Optimise building the host address in 11.natgw | Martin Schwenke | 2012-07-26 | 1 | -3/+3
  It can be built without forking unnecessary processes. Also downcase the variable name because it is local to the script.
  Signed-off-by: Martin Schwenke <martin@meltin.net>
  (This used to be ctdb commit 34f58a0773618c4508a55ad75fc4602dad5a5f4c)
* Eventscripts: Clean up startup sanity check in 11.natgw | Martin Schwenke | 2012-07-26 | 1 | -8/+3
  Signed-off-by: Martin Schwenke <martin@meltin.net>
  (This used to be ctdb commit f6e421e8bf935cae790a6dc2b861eb9c7f8610b4)