Commit message    Author    Age    Files    Lines
...
| * | Eventscripts - 10.interfaces should not check orphaned interfaces.    Martin Schwenke    2011-08-02    1    -8/+5
| |/
| |     If the last IP address on an interface is removed then that interface should no longer be checked by 10.interfaces. However, "ctdb ifaces" still lists such interfaces so they are currently checked.
| |     The problem really needs to be addressed in ctdbd but a neat quick eventscript fix will be minimally invasive...
| |     This changes the code to use "ctdb -Y ip -v" instead of "ctdb -Y ifaces". The former includes details of all public addresses and associated interfaces, so when an address is removed there is no output for it. This prevents orphaned interfaces from being listed.
| |     The logic is also slightly improved so that $IFACES includes just a (non-uniquified) list of interfaces, allowing an existing loop to be removed.
| |     Signed-off-by: Martin Schwenke <martin@meltin.net>
| |     (This used to be ctdb commit 49b2d1bd9554461ed8edbfc21e777c0eca9e1443)
| * Merge branch 'master' of 10.1.1.27:/shared/ctdb/ctdb-master    Ronnie Sahlberg    2011-07-29    5    -113/+955
| |\
| | |     (This used to be ctdb commit 518945e59e2e48f07fcc0955f3aa81cd0d946aea)
| | * Tests: Initial test code for LCP2 IP allocation algorithm.    Martin Schwenke    2011-07-29    4    -7/+418
| | |     Move struct ctdb_public_ip_list to ctdb_private.h and put some definitions for some functions from ctdb_takeover.c there. This allows those functions to be called from unit tests.
| | |     Add ctdb_takeover_tests.c and the Makefile support to build it.
| | |     Signed-off-by: Martin Schwenke <martin@meltin.net>
| | |     (This used to be ctdb commit 9d34be0233edf3bc022345c0494c4b2a4d7f8480)
| | * IP allocation - add LCP2 algorithm.    Martin Schwenke    2011-07-29    3    -106/+537
| | |     The current non-deterministic IP allocation algorithm balances IPs across the whole cluster. It does not consider different interfaces/VLANs/subnets, so these different groups of IPs aren't generally well balanced.
| | |     This adds the LCP2 algorithm for IP allocation and allows it to be enabled by setting the "LCP2PublicIPs" tunable to 1.
| | |     The LCP2 algorithm calculates the imbalance of a node by totalling the squares of the distances between each IP on the node. The IP distance is defined as the length of the longest common prefix (LCP) of bits that is found when comparing 2 IPs. The imbalance of a cluster is the maximum imbalance for any node. At each step the algorithm selects the IP/node allocation that best reduces the imbalance of the cluster.
| | |     The implementation splits out the IP allocation part of ctdb_takeover_run() into new function ctdb_takeover_run_core(), and then extracts the basic IP assignment code into new functions basic_allocate_unassigned() and basic_failback(). 3 new functions lcp2_init(), lcp2_allocate_unassigned() and lcp2_failback() implement the LCP2 algorithm, and are hooked into ctdb_takeover_run_core().
| | |     Signed-off-by: Martin Schwenke <martin@meltin.net>
| | |     (This used to be ctdb commit 61fc7fbd0235469df22deb6581c6bd47e30bc0be)
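The imbalance measure described in the LCP2 commit message can be illustrated with a short Python sketch. This is not the ctdb C code: the function names are hypothetical, each unordered pair of addresses is counted once (the message does not say whether pairs are counted once or twice), and addresses of different families are assumed to have distance 0.

```python
import ipaddress
from itertools import combinations


def ip_distance(a: str, b: str) -> int:
    """Length, in bits, of the longest common prefix of two addresses."""
    ia, ib = ipaddress.ip_address(a), ipaddress.ip_address(b)
    if ia.version != ib.version:
        return 0  # assumption: IPv4 and IPv6 addresses share no prefix
    differing = int(ia) ^ int(ib)
    # bit_length() gives the position of the highest differing bit.
    return ia.max_prefixlen - differing.bit_length()


def node_imbalance(ips) -> int:
    """Sum of squared distances over every pair of IPs hosted on one node."""
    return sum(ip_distance(a, b) ** 2 for a, b in combinations(ips, 2))


def cluster_imbalance(allocation) -> int:
    """Maximum node imbalance; allocation maps node number -> list of IPs."""
    return max((node_imbalance(ips) for ips in allocation.values()), default=0)
```

Two addresses from the same subnet share a long prefix, so a node holding both scores a high imbalance; moving one of them to another node lowers the cluster's maximum, which is the quantity each allocation step tries to reduce.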
| * | Merge branch 'master' of 10.1.1.27:/shared/ctdb/ctdb-master    Ronnie Sahlberg    2011-07-29    2    -14/+44
| |\|
| | |     (This used to be ctdb commit 0e60a738f9a6275ed45abc3d933f872d93132d92)
| | * Update the delip command    Ronnie Sahlberg    2011-07-29    1    -3/+5
| | |     Don't talloc_free(vnn) immediately but postpone it until later when the eventscript callback has completed.
| | |     CQ S1026664
| | |     (This used to be ctdb commit 0a99e8742a261b1d3a2c8830f5c19ea6c2c47cad)
| | * eventscript: fix callback after free    Rusty Russell    2011-07-29    1    -11/+39
| | |     ctdb_event_script_callback() takes a mem_ctx arg which it doesn't use, but the implication is pretty clear, that when that mem_ctx is freed, the callback shouldn't happen.
| | |     Indeed, Ronnie reproduced a case where that callback refers to freed memory, in the ip reallocation code under stress.
| | |     So attach the callback to the mem_ctx they give us, and remove it from the script state structure when that's freed. It's a bit weird, but it works.
| | |     CQ: S1026179
| | |     Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
| | |     (This used to be ctdb commit 6fcd867cc835ef1ffc1c50964f135c346503d40c)
| * | packaging: honour rpm build target options handed in to makerpms.sh    Michael Adam    2011-07-22    1    -1/+12
| | |     This allows calling e.g. "makerpms.sh -bs" to build only the source RPM.
| | |     (This used to be ctdb commit c6bfba2bb66962b7b05d708f0747002700991472)
| * | Merge branch 'master' of ssh://git.samba.org/data/git/ctdb    Ronnie Sahlberg    2011-07-20    7    -11/+297
| |\ \
| | | |     (This used to be ctdb commit a1b3661973489f0111e7975fec422fb99a25f0c8)
| | * | web: correctly terminate list items <li> with </li> instead of with <br>    Michael Adam    2011-07-08    1    -5/+5
| | | |     (This used to be ctdb commit 3f698e69a56305c5ec27b8d119bf2d57d5cd2ec6)
| | * | web: add Stefan Metzmacher to the list of CTDB developers.    Michael Adam    2011-07-08    1    -0/+1
| | | |     (This used to be ctdb commit 912a33cebe7c51b33cda2e6d5f2b3a481fa7fd49)
| | * | client: handle transient connection errors    David Disseldorp    2011-06-23    1    -5/+30
| | | |     Client connections to the ctdbd unix domain socket may fail intermittently while the server is under heavy load. This change introduces a client connect retry loop. During failure the client will retry for a maximum of 64 seconds; the ctdb --timelimit option can be used to cap client runtime.
| | | |     Signed-off-by: Michael Adam <obnox@samba.org>
| | | |     (This used to be ctdb commit dc0c58547cd4b20a8e2cd21f3c8363f34fd03e75)
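The shape of such a connect retry loop can be sketched in Python as follows. Only the 64-second ceiling comes from the commit message; the function name, the fixed one-second pause between attempts, and the injectable `clock` and `sleep` parameters are assumptions made for illustration, and the actual C client may space its retries differently.

```python
import time


def connect_with_retry(connect, max_wait=64.0, interval=1.0,
                       clock=time.monotonic, sleep=time.sleep):
    """Call connect() until it succeeds or max_wait seconds have elapsed.

    connect is any callable that raises ConnectionError on a transient
    failure (hypothetical stand-in for opening the ctdbd unix socket).
    """
    deadline = clock() + max_wait
    while True:
        try:
            return connect()
        except ConnectionError:
            if clock() >= deadline:
                raise  # give up: the failure is no longer treated as transient
            sleep(interval)  # assumption: fixed pause between attempts
```

Passing the clock and sleep functions as parameters keeps the loop testable without real delays.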
| | * | Manpage for ping_pong    Mathieu Parent    2011-06-23    5    -1/+261
| | | |     Signed-off-by: Michael Adam <obnox@samba.org>
| | | |     (This used to be ctdb commit af75d3e37412e03d3978073edbe6dee78f265c3c)
| * | | Add a text about "ban" "unban" not being permanent and that recovery daemon can auto unban nodes. Suggest using "stop" / "continue" instead.    Ronnie Sahlberg    2011-07-09    1    -0/+3
| | |/
| |/|
| | |     (This used to be ctdb commit 8e30dffad5b1385818b2d7350d6c3767a220d745)
| * | When trying to re-balance the ip assignment and shuffle ips from    Ronnie Sahlberg    2011-07-06    1    -4/+10
| | |     nodes with many addresses to nodes with few addresses, loop up to num_ips+5 times instead of only 5 times.
| | |     When we have very many public ips per node, we might need to loop more than 5 times or else we will exit without reaching optimal balance.
| | |     (This used to be ctdb commit aa8114a625a637277561a66c80bdece3c27e9e20)
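A minimal Python sketch shows why a fixed five passes can fall short. It assumes, which the commit message does not state, that each pass moves a single address from the most loaded node to the least loaded one; the function name and the list-of-counts representation are hypothetical.

```python
def rebalance(counts, max_passes):
    """Return per-node IP counts after at most max_passes single-IP moves."""
    counts = list(counts)
    for _ in range(max_passes):
        busiest = counts.index(max(counts))
        idlest = counts.index(min(counts))
        if counts[busiest] - counts[idlest] <= 1:
            break  # already as even as whole addresses allow
        counts[busiest] -= 1
        counts[idlest] += 1
    return counts
```

With 20 addresses all on one of two nodes, `rebalance([20, 0], 5)` stops at `[15, 5]`, whereas a limit of num_ips+5 (here 25) is enough to reach `[10, 10]`.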
| * | Add log output to wipedb and backupdb    Ronnie Sahlberg    2011-07-06    1    -0/+5
| | |     CQ S1025379
| | |     (This used to be ctdb commit 6f51d4a75f8a9f2cdb8ecde946ed31809ab5a415)
| * | change the name for the key for the record where we store the public address config from public-addresses... to public_addresses...    Ronnie Sahlberg    2011-06-28    1    -1/+1
| |/
| |     CQ1019030
| |     (This used to be ctdb commit 114d5034ff4880848588caf493382a537a1469ae)
| * onnode: fix natgwlist nodespec    Martin Schwenke    2011-06-08    1    -5/+21
| |     This hasn't worked for a while if ever. We treat this case specially because the output has 2 words on the 1st line.
| |     We also handle the error case where /etc/ctdb_natgw_nodes exists but none of the other $NATGW_* configuration is done.
| |     Signed-off-by: Martin Schwenke <martin@meltin.net>
| |     (This used to be ctdb commit 66e89797c7866d207a5bbf1836f52d70dba7cea6)
| * onnode: fix get_nodes_with_status()    Martin Schwenke    2011-06-08    1    -8/+11
| |     Setting IFS and looping through items with colons in them doesn't work. Change this to read through the output line by line. The header line needs to be thrown away by throwing away everything up to the 1st newline.
| |     Keep stderr from the "ctdb status" command, otherwise debugging is impossible. On error, append any output from ctdb to onnode's error message.
| |     Signed-off-by: Martin Schwenke <martin@meltin.net>
| |     (This used to be ctdb commit d60592cf99999f10344a05ef0571fb300bb9d97c)
| * onnode: Remove an unnecessary comment.    Martin Schwenke    2011-06-08    1    -1/+0
| |     The comment about $CTDB_NODES_SOCKETS is meaningless. The code it refers to works just fine with $CTDB_NODES_SOCKETS.
| |     Signed-off-by: Martin Schwenke <martin@meltin.net>
| |     (This used to be ctdb commit 74e69a564bac653dadfffe8b08145b9b3be16e61)
| * onnode: Future-proof get_nodes_with_status().    Martin Schwenke    2011-06-08    1    -28/+29
| |     The current code requires knowledge of the number of status bits output by "ctdb status -Y". This changes the code to be completely general.
| |     Signed-off-by: Martin Schwenke <martin@meltin.net>
| |     (This used to be ctdb commit e1788f25fde3d1f26bf4831a331741aa280f6fbc)
| * onnode: Exit with error for unknown command-line flags.    Martin Schwenke    2011-06-08    1    -1/+3
| |     Use of "local" was masking errors in command-line processing.
| |     Signed-off-by: Martin Schwenke <martin@meltin.net>
| |     (This used to be ctdb commit ca80adda7517b43147ef30156ae34c66b29fa2bd)
| * onnode: Be defensive when listing IPs of nodes with designated status.    Martin Schwenke    2011-06-08    1    -1/+1
| |     The current version gives the last item left after stripping the known fields. If an insufficient number of status fields is stripped then this would return a residual status field value, which turned out to be a valid IP address for localhost... so no error occurs.
| |     This change means that the node number is stripped and any residual status field value will stay appended, causing an error the first time this command is tested.
| |     Signed-off-by: Martin Schwenke <martin@meltin.net>
| |     (This used to be ctdb commit 74715e6ec7b67c6f0e863aa51c87279758d6bf91)
| * onnode - Fix long standing bug in onnode healthy/ok/connected/con.    Martin Schwenke    2011-06-08    1    -2/+2
| |     When the output of "ctdb status -Y" changed to add an extra status column we didn't fix onnode. This adds a match for the extra column.
| |     Signed-off-by: Martin Schwenke <martin@meltin.net>
| |     (This used to be ctdb commit 793febaebd3d484ddfbbcb47aaa0cdf3cfc1a00d)
| * Fix bashism    Mathieu Parent    2011-05-14    1    -1/+1
| |     ... again ;-)
| |     Signed-off-by: Michael Adam <obnox@samba.org>
| |     (This used to be ctdb commit 2266586c1839af032622be54dc7f71e39d2bd9ef)
| * Merge branch 'master' of ssh://git.samba.org/data/git/ctdb    Ronnie Sahlberg    2011-05-12    3    -110/+116
| |\
| | |     (This used to be ctdb commit 307e915459c26a728a1ec16bd735d983d493df53)
| | * doc: regenerate ctdb docs    Michael Adam    2011-05-12    2    -97/+101
| | |     (This used to be ctdb commit 2d67186e5acd5aa8cb3eb1f4fbd4a41153c52e96)
| | * doc/ctdb.1.xml: update listvars documentation    Luk Claes    2011-05-12    1    -10/+12
| | |     Signed-off-by: Luk Claes <luk@debian.org>
| | |     Signed-off-by: Michael Adam <obnox@samba.org>
| | |     (This used to be ctdb commit afd96d5990815019b1f9ddc8b78a05f86eca0421)
| | * doc: regenerate ctdb docs    Luk Claes    2011-05-12    2    -79/+79
| | |     Signed-off-by: Luk Claes <luk@debian.org>
| | |     Signed-off-by: Michael Adam <obnox@samba.org>
| | |     (This used to be ctdb commit 39f595cad5321c64e2b1e72fe7b4bbb720f4b906)
| | * doc/ctdb.1.xml: Fix typo    Luk Claes    2011-05-12    1    -1/+1
| | |     s/poerwoff/poweroff/
| | |     Bug 8124
| | |     Signed-off-by: Luk Claes <luk@debian.org>
| | |     Signed-off-by: Michael Adam <obnox@samba.org>
| | |     (This used to be ctdb commit a6d2f1bd552dba33640acb7a0b8110534debd4ce)
| * | When using multiple VLANs, some funky stuff can sometimes happen when    Ronnie Sahlberg    2011-05-12    1    -7/+6
| |/
| |     adding/removing IP addresses, causing routes to be dropped by the system.
| |     The easiest workaround for this is to unconditionally try to reapply all static routes for all interfaces once ipreallocation has finished, not just adding them back on the affected interface.
| |     This works around a funky issue in CQ S1023538
| |     (This used to be ctdb commit 84600d1f53632d5fe76c308727f31f61b5ec1010)
| * Remove all checking of GPFS from ctdb_diagnostics    Ronnie Sahlberg    2011-05-11    1    -32/+0
| |     CQ S1023524
| |     (This used to be ctdb commit 4cddba08b46db0a56a86b32403a41b89cd097317)
| * If samba fails to start for some reason, make this cause the startup event to fail too, so that ctdbd will re-try the startup event later.    Ronnie Sahlberg    2011-05-10    1    -3/+14
| |     Or else this will leave samba not running.
| |     CQ S1023394
| |     (This used to be ctdb commit f90485b08d32cbe56050718a3b28ca0fe1d64e0f)
| * Don't exit from checking interfaces once we have found one interface that is not    Ronnie Sahlberg    2011-05-10    1    -1/+1
| |     in use by public addresses. This can happen when we have removed existing interfaces/ip addresses and prevents us from verifying the status of other interfaces.
| |     (This used to be ctdb commit d67955b42f7627be9dae995230c8fcbb8a948ec2)
| * Remove logging of spam/errors from the 10.interfaces    Ronnie Sahlberg    2011-05-09    1    -10/+9
| |     script if/when we have for example NATGW configured but no public addresses defined on that interface
| |     CQ S1023378
| |     (This used to be ctdb commit 8837daa424732aeb5a20814b1709c345a97a0e09)
| * packaging: add ltdbtool and its manpage to the RPM    Michael Adam    2011-05-04    1    -0/+2
| |     (This used to be ctdb commit ce6409dc7d059701f0fe4b57e7c05c38c66629c5)
| * install the ltdbtool manpage with "make install"    Michael Adam    2011-05-04    1    -0/+1
| |     (This used to be ctdb commit ffbff1affed8301831387e23b4f8f824d9f78e20)
| * install ltdbtool with "make install"    Michael Adam    2011-05-04    1    -0/+1
| |     (This used to be ctdb commit 991ea66e5ed0eb7ab256dc8e3118dc78462d4752)
| * build "ltdbtool" in "make all"    Michael Adam    2011-05-04    1    -1/+1
| |     (This used to be ctdb commit d91e80c698a7706460e9ee74bd4f5a9ab0a7b9b1)
| * ltdbtool: add manpage html + roff    Gregor Beck    2011-05-04    2    -0/+342
| |     Signed-off-by: Michael Adam <obnox@samba.org>
| |     (This used to be ctdb commit 992baa4215bfc1b29fd153ccb7c42bb0cb66fa4f)
| * ltdbtool: add manpage    Gregor Beck    2011-05-04    2    -1/+232
| |     Signed-off-by: Michael Adam <obnox@samba.org>
| |     (This used to be ctdb commit 2ed3603274cd38dde4ae98eef653e9a9de631eb5)
| * add ltdbtool - a standalone ltdb tool    Gregor Beck    2011-05-04    2    -0/+381
| |     This is a tool to handle (dump and convert) ctdb's local tdb copies (ltdbs) without connecting to a ctdb daemon.
| |     It can be used to
| |      * dump the contents of a ltdb, printing the ctdb record header information
| |      * dump a non-clustered tdb database (like tdbdump)
| |      * convert between an ltdb and a non-clustered tdb (adding or removing ctdb headers)
| |      * convert between 64 and 32 bit ltdbs (the ctdb record headers differ by 4 bytes of padding)
| |     usage:
| |       bin/ltdbtool dump [-p] [-s{0|32|64}] <idb>
| |       bin/ltdbtool convert [-s{0|32|64}] [-o{0|32|64}] <idb> <odb>
| |     Pair-Programmed-With: Michael Adam <obnox@samba.org>
| |     (This used to be ctdb commit efcf2815711cd5371633614fb91273bd0a786da0)
| * ctdb catdb: fix escaping of '"' and '\'    Gregor Beck    2011-05-04    1    -1/+1
| |     Signed-off-by: Michael Adam <obnox@samba.org>
| |     (This used to be ctdb commit 2b5cb0841fd813cd54be170c305a828885e0f038)
| * Don't call the UPDATE event if both old and new interface are the same.    Ronnie Sahlberg    2011-05-04    1    -3/+14
| |     CQ S1018175
| |     (This used to be ctdb commit 6a74515f0a1e24d97cee3ba05d89133aac7ad2b7)
| * Cleanup of logging messages/spamming    Ronnie Sahlberg    2011-05-04    2    -5/+9
| |     Reduce an informational message about not performing ip reallocation from NOTICE (the default) to INFO. These messages are normal during startup or when stopped/banned when we will be in recovery mode for a while.
| |     Remove a message in the loop waiting for initial startup to complete about the generation being invalid. It is always invalid at this stage before we have finished initial recovery.
| |     Rate-limit the informational messages for CTDB_WAIT_UNTIL_RECOVERED so that we only print them once per second for the first 60 seconds and after that only once per 10 minutes. These messages are normal during startup, but we should not be logging them every second for cases where we will remain in recovery mode during startup for an extended period of time, such as if suspended or permabanned.
| |     CQ S1023302
| |     (This used to be ctdb commit 3a0af8780dc595acbed880f288fcbc4f62c862fb)
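The rate-limiting rule for the CTDB_WAIT_UNTIL_RECOVERED messages can be written down as a small Python predicate. This is a sketch of the rule as stated in the commit message, not the C implementation; it assumes a hypothetical caller that polls once per second and passes the whole number of seconds waited so far.

```python
def should_log(elapsed_seconds: int) -> bool:
    """Return True when the wait message should be printed on this tick."""
    if elapsed_seconds < 60:
        return True  # first minute: once per second
    return elapsed_seconds % 600 == 0  # afterwards: once per 10 minutes
```

Over the first hour of waiting this logs on 65 ticks (60 in the first minute, then one every ten minutes) instead of 3600.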
| * bonding mode 4 monitoring:    Ronnie Sahlberg    2011-04-13    1    -0/+8
| |     We cannot just check if MII Status is up for bonding mode 4, since the kernel will always report the bond device as UP even if all cables are disconnected.
| |     For mode 4, ignore the status of the bond device and instead check if at least one slave interface is up when determining if the device is good or bad.
| |     (This used to be ctdb commit a6930cec6d9503dba18b9d4839d87a1c1a8ddba2)
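The decision described for bonding mode 4 can be sketched in Python. The real check lives in a shell eventscript; this version is an illustration that assumes the input is text in the layout of the Linux /proc/net/bonding/<device> file ("Bonding Mode:", "MII Status:" and "Slave Interface:" lines) and that mode 4 is identified by "802.3ad" in the mode line. The function name is hypothetical.

```python
def bond_is_good(status_text: str) -> bool:
    """Decide whether a bond device is usable from bonding status text."""
    mode = ""
    bond_up = False
    slaves_up = 0
    in_slave_section = False
    for line in status_text.splitlines():
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if key == "Bonding Mode":
            mode = value
        elif key == "Slave Interface":
            in_slave_section = True  # later MII Status lines belong to slaves
        elif key == "MII Status":
            if in_slave_section:
                if value == "up":
                    slaves_up += 1
            else:
                bond_up = (value == "up")
    if "802.3ad" in mode:
        # Mode 4: ignore the bond's own status and require one live slave.
        return slaves_up > 0
    return bond_up
```

For other bonding modes the sketch falls back to the bond device's own MII Status, matching the behaviour the commit message describes as insufficient only for mode 4.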
| * If the eventscript is finished but state->ctdb is NULL,    Ronnie Sahlberg    2011-04-12    1    -0/+5
| |     log an error and return. (Need to find the root cause for this soon too.)
| |     (This used to be ctdb commit 2e80d53b73fcba58ed5a72bab66c051691ccf719)
| * IFACE handling. Assume links are always good on startup (they almost always    Ronnie Sahlberg    2011-04-11    2    -34/+26
| |     Simplify the handling of setting the links in the 10.interface eventscript and remove the optimization to only call setifacelink on state change to make the code simpler to read.
| |     If a take ip event fails, flag the node as unhealthy.
| |     Add a check to the interface script to check if the interface exists or if it has been deleted, so that we can detect this and become UNHEALTHY if someone deletes an interface we are using to host public addresses.
| |     (This used to be ctdb commit 4ab63d2a7262aff30d5eced184c294c9c9dd4974)
| * web: use the new git repository url on the download page    David Disseldorp    2011-04-07    1    -1/+1
| |     Signed-off-by: Michael Adam <obnox@samba.org>
| |     (This used to be ctdb commit b36818888fac7ebbed26fcdd2dd1d426e3d2f8f0)
| * NATGW: don't set arp_ignore in 11.natgw anymore since we no longer    Ronnie Sahlberg    2011-04-06    1    -4/+0
| |     need this for the natgw functionality
| |     (This used to be ctdb commit bf3bf2967e3781c918e33b3a210e68e0ccca0c51)