path: root/ctdb
Commit message, Author, Age, Files, Lines
...
* | config/10.interface: use delete_ip_from_iface also in the "init" eventStefan Metzmacher2010-02-231-1/+1
| | | | | | | | | | | | metze (This used to be ctdb commit e2bc5c25116747c58505fe1cb3e2d164257377d1)
* | config/11.natgw: use delete_ip_from_iface() instead of remove_ip()Stefan Metzmacher2010-02-231-2/+5
| | | | | | | | | | | | | | | | | | This also initializes the variables correctly for the shutdown|removenatgw code path to delete_all. metze (This used to be ctdb commit 2c2cbed4fcbc868a990fa6b32fc96126ffc61bb5)
* | config: make remove_ip() a wrapper of delete_ip_from_iface()Stefan Metzmacher2010-02-231-18/+8
| | | | | | | | | | | | metze (This used to be ctdb commit e66d6636b80e3614f183366ec92fc3c6d5c323da)
* | config: interface_modify states in a $CTDB_BASE/state/interface_modify directoryStefan Metzmacher2010-02-231-6/+27
| | | | | | | | | | | | metze (This used to be ctdb commit 756c8b953fef7132dae74b5b244baeb3108dec54)
* | config: add setup_iface_ip_readd_script() helper functionStefan Metzmacher2010-02-232-4/+89
| | | | | | | | | | | | | | | | | | | | This adds a generic infrastructure to register scripts which will be called when the delete_ip_from_iface() function needs to readd secondary ips to an interface. metze (This used to be ctdb commit ac97d65f44e8dc8bf2ec8f68e4db3448521755a2)
* | config: readd ips with a broadcast address in delete_ip_from_iface()Stefan Metzmacher2010-02-231-1/+1
| | | | | | | | | | | | metze (This used to be ctdb commit e7a6f64cf5bce5abdc47f5db96b286c5a8d66aff)
* | In ctdb_control_end_recovery,Ronnie Sahlberg2010-02-231-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We used to talloc_steal c (the command packet) and make it a child of the "event script state context". If we failed to create an eventscript child context for some reason, this would have talloc freed state, but at the same time it would also implicitly have freed c. Once ctdb_control_end_recovery() returns the error back to the caller, the caller would dereference both c, and also outdata which is a child of c and we would either read garbage data or segv. Change the ordering so we only talloc_steal c as a child of state IFF we have successfully created a child context for the script. BZ61068 (This used to be ctdb commit 259054c3632e42bbaa614ee7e888e6e850733d60)
* | Make sure that the natgw eventscript also triggers on the "stopped" eventRonnie Sahlberg2010-02-231-1/+1
|/ | | | | | | | to remove the natgw configuration and ip assignments used. BZ61036 (This used to be ctdb commit 344b1f95b126ecabeb4576330038b08bf88e8cb8)
* ctdb regsrvids is much more useful for testing if it sleeps once it has ↵Ronnie Sahlberg2010-02-221-0/+2
| | | | | | | | registered its srvid. Otherwise, as soon as it terminates, ctdbd will deregister the id automatically. (This used to be ctdb commit 23b059dcb8074872d7900b225790d4df7da071b6)
* From Sumit Bose <sbose@redhat.com>Ronnie Sahlberg2010-02-221-7/+13
| | | | | | Fixes for init script to meet guidelines (This used to be ctdb commit 9f484404030211df85a215fd2280568a2ec020fb)
* From Elia Pinto <gitter.spiros@gmail.com>Ronnie Sahlberg2010-02-221-3/+0
| | | | | | We don't need to include getopt.h under AIX (This used to be ctdb commit fcebbc3484ce56c57def745ea51c053dfb02a657)
* Ignore any scripts that time out for most events, except startup.Ronnie Sahlberg2010-02-161-1/+15
| | | | | | Treat hung scripts always (except startup) as success. (This used to be ctdb commit b6d939c9758c7d2e39206838492f2f644dd61db7)
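The policy in this commit message can be read as a small decision function. The sketch below is only an illustration of that rule: the enum, the function name `event_result` and the use of the event name as a string are assumptions made for the example, not taken from the ctdb sources.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical outcome of running one event script (not a ctdb type). */
enum script_outcome { SCRIPT_OK, SCRIPT_FAILED, SCRIPT_TIMED_OUT };

/*
 * Returns 0 if the event should be reported as successful, -1 otherwise.
 * A timed-out script counts as success for every event except "startup".
 */
int event_result(const char *event, enum script_outcome outcome)
{
    assert(event != NULL);

    if (outcome == SCRIPT_OK) {
        return 0;
    }
    if (outcome == SCRIPT_TIMED_OUT && strcmp(event, "startup") != 0) {
        return 0; /* hung script ignored for non-startup events */
    }
    return -1; /* real failure, or a timeout during startup */
}
```

With this rule a hung "monitor" script does not fail the event, while a hung "startup" script still does.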
* try to restart rpc-rquotad if it is not runningRonnie Sahlberg2010-02-161-0/+10
| | | | | | bz60317 (This used to be ctdb commit 2263cd74d511247debadd0f6602bc6396b46ac5e)
* Leave sequence number alone when merely migrating records.Rusty Russell2010-02-161-0/+19
| (Based on earlier version from Ronnie which modified tdb; this one is standalone).
|
| When storing records in a tdb that has "automatic seqnum updates" also check if the actual data for the record has changed or not. If it has not changed at all, except for possibly the header, this is likely just a dmaster migration operation in which case we want to write the record to the tdb but we do not want the tdb sequence number to be increased.
|
| This resolves the problem of notify.tdb being thrashed under load: the heuristic in smbd to only reread this when the sequence number increases (rarely) breaks down.
|
| Before, running nbench --num-progs=512 across 4 nodes, we saw numbers like:
|
|   512 1496 118.33 MB/sec execute 60 sec latency 0.00 msec
|
| And turning on latency tracking, this was typical in the logs:
|
|   ctdbd: High latency 9380914.000000s for operation lockwait on database notify.tdb
|
| After this commit:
|
|   512 2451 143.85 MB/sec execute 60 sec latency 0.00 msec
|
| And no more latency messages...
|
| Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
|
| (This used to be ctdb commit 9ed2f8b2fcb7e3f0d795eef22cfa317066490709)
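The check described above (bump the sequence number only when the record data, ignoring the header, has changed) might look roughly like the following. This is a minimal sketch under stated assumptions: the `struct record` layout, the fixed `HEADER_LEN` of 8 bytes and the function names are hypothetical and do not reflect the real ctdb or tdb record format.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define HEADER_LEN 8 /* assumed header size, for illustration only */

/* Hypothetical stored record: HEADER_LEN header bytes, then the payload. */
struct record {
    const unsigned char *buf;
    size_t len;
};

/* True if anything after the header differs between the two versions. */
static bool payload_changed(struct record old_rec, struct record new_rec)
{
    assert(old_rec.buf != NULL && new_rec.buf != NULL);

    if (old_rec.len != new_rec.len) {
        return true;
    }
    if (old_rec.len <= HEADER_LEN) {
        return false; /* no payload on either side */
    }
    return memcmp(old_rec.buf + HEADER_LEN, new_rec.buf + HEADER_LEN,
                  old_rec.len - HEADER_LEN) != 0;
}

/* Sequence number to use after storing new_rec over old_rec. */
unsigned next_seqnum(unsigned seqnum, struct record old_rec,
                     struct record new_rec)
{
    return payload_changed(old_rec, new_rec) ? seqnum + 1 : seqnum;
}
```

A header-only change, such as a dmaster migration, leaves the sequence number untouched, so readers that poll the sequence number are not woken up needlessly.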
* Reduce loglevel for two eventscript related debug messagesRonnie Sahlberg2010-02-161-2/+2
| | | | (This used to be ctdb commit f8994790e65baebb81bbfad646cdda6234b6d29a)
* Reducing the log level for a debug messageRonnie Sahlberg2010-02-161-1/+1
| | | | | | DEBUG(DEBUG_DEBUG,("pnn %u starting migration of %08x t\ (This used to be ctdb commit 6ce4b21b00cce1530aff022584bf695c257a5d55)
* Reduce the log level for two debug messagesRonnie Sahlberg2010-02-161-2/+2
| | | | | | | DEBUG(DEBUG_DEBUG,("pnn %u dmaster response %08x\n", ctdb->pnn, ctdb_has DEBUG(DEBUG_DEBUG,("pnn %u dmaster request on %08x for %u from %u\n", (This used to be ctdb commit a3473e7a445b14520a49585c460429dfbfe1fce0)
* Add a variable CTDB_CHECK_SWAP_IS_NOT_USED="yes"Ronnie Sahlberg2010-02-161-5/+7
| | | | | | | | | | | to control whether or not to check if we are swapping, and produce useful output into the logfile if we are. For production systems with dedicated nas-heads we should never swap. But for developer/test systems we often use smaller nondedicated systems where we can no longer guarantee that we will not be using swap. (This used to be ctdb commit db87849bf3380914a63a626412bec209dbea7d20)
* lower the loglevel for a debug message for redundant releases of public ipsRonnie Sahlberg2010-02-161-1/+1
| | | | (This used to be ctdb commit cfc1a4f878b61c85063af649d2339431e799647d)
* Add a new variable : CTDB_NFS_SKIP_KNFSD_ALIVE_CHECKRonnie Sahlberg2010-02-161-1/+3
| | | | | | | | when set to "yes" this will skip checking if knfsd has hung or not. bz59626 (This used to be ctdb commit b0bf3794753c5bb898295b5109707953cc3dcec5)
* fixed printing of high latencyAndrew Tridgell2010-02-161-1/+1
| | | | (This used to be ctdb commit 88aacab30a36d66fe03d120bbf655edfe791ec32)
* Test suite: Make "ctdb ip" test backward compatible with older ctdb versions.Martin Schwenke2010-02-101-14/+7
| | | | | | | | | | | | Recent updates to the test meant that it only worked with the latest ctdb versions. This changes things so that we never bother matching the machine readable header, just the actual data in the output. It also takes a slightly more liberal approach in massaging the human readable output to ensure it matches the machine readable output. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 8a1cb5dc1ddf82f3b9cbb23e40b3914b3d5c2783)
* Merge commit 'origin/master'Martin Schwenke2010-02-104-94/+133
|\ | | | | | | (This used to be ctdb commit 19523fbb12db1ec1e5ee38de1b2d3b99a74c6ca4)
| * commands that relate to manual failover of ip addresses (moveip)Ronnie Sahlberg2010-02-091-3/+3
| | | | | | | | | | | | can sometimes take long so allow for a longer timeout for the controls used. (This used to be ctdb commit 144c69b633eeb17e120f962162feed6de3dc16a6)
| * don't just exit(0) upon successful completion of waiting for an ipreallocate ↵Ronnie Sahlberg2010-02-091-1/+9
| | | | | | | | | | | | | | | | | | | | to finish. Return success back to the caller instead. Otherwise things like 'ctdb enable -n all' will just finish after the first disabled node has become enabled. (This used to be ctdb commit f4eb41cd3a1099da8265351818fba9bd4688a188)
| * event scripts: add logging for low memory conditionsRusty Russell2010-02-091-0/+10
| | | | | | | | | | | | | | | | We should never enter swap; if we do, show the memory state of the machine and the process list. This will help us diagnose what caused the condition before it's too late and the box starts OOM-killing processes. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (This used to be ctdb commit 627a6d67a0e9e61f8713e62695b3518c51909230)
| * ctdb: migrate to new dlinklist.h from SambaAndrew Tridgell2010-02-092-90/+111
| | | | | | | | (This used to be ctdb commit f63c091f12f8d582e9518673365c7c52479c470c)
* | onnode documentation - update documentation to reflect recent onnode changes.Martin Schwenke2010-02-053-25/+63
| | | | | | | | | | | | Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 2fb2eb0fd7396de33474ce43fe95c66a5784d05b)
* | Merge branch 'master' of git://git.samba.org/sahlberg/ctdbMartin Schwenke2010-02-0515-25/+170
|\| | | | | | | (This used to be ctdb commit a442668923d4d8f8d624e00138fe37d76d593d21)
| * ctdb: when we fill the client packet queue we need to drop the clientAndrew Tridgell2010-02-041-5/+12
| | | | | | | | | | | | | | | | | | | | We can't just drop packets to the list, as those packets could be part of the core protocol the client is using. This happens (for example) when Samba is doing a traverse. If we drop a traverse packet then Samba hangs indefinitely. We are better off dropping the ctdb socket to Samba. (This used to be ctdb commit a7a86dafa4d88a6bbc6a71b77ed79a178fd802a6)
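The behaviour described here, disconnecting the client instead of silently discarding individual packets once the queue is full, can be sketched as below. The struct, its fields and `queue_packet` are hypothetical names for this example, and a counter stands in for the real packet list.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-client outgoing queue state (not a ctdb struct). */
struct client_queue {
    size_t len;     /* packets currently queued */
    size_t max_len; /* limit before the client is dropped */
    bool connected; /* false once the client has been dropped */
};

/*
 * Try to queue one packet. On overflow the whole client is marked as
 * dropped rather than losing a single packet, since a lost packet may be
 * one the client will wait for indefinitely.
 * Returns true if the packet was queued.
 */
bool queue_packet(struct client_queue *q)
{
    assert(q != NULL);

    if (!q->connected) {
        return false;
    }
    if (q->len >= q->max_len) {
        q->connected = false; /* caller should close the socket */
        q->len = 0;           /* queued packets are discarded with it */
        return false;
    }
    q->len++;
    return true;
}
```

The caller is expected to close the client socket once `connected` becomes false; the client can then reconnect and start from a clean state.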
| * ctdb: move ctdb_io.c to use TLIST_*() macrosAndrew Tridgell2010-02-041-21/+6
| | | | | | | | | | | | This will make large packet queues much more efficient (This used to be ctdb commit e3f198056230073135ea6354bbef30c5bb022f8f)
| * util: added TLIST_*() macrosAndrew Tridgell2010-02-041-0/+46
| | | | | | | | | | | | | | | | | | | | | | | | The TLIST_*() macros are like the DLIST_*() macros, but take both a head and tail pointer for the list. This means that adding an element to the end of the list is efficient (it doesn't need to walk the list). We should move all uses of the DLIST_*() macros which use DLIST_ADD_END() to use the TLIST_*() macros instead. (This used to be ctdb commit 2d05a71349e9ade869b62cf261c2a9a21818a474)
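The benefit of keeping a tail pointer next to the head pointer can be illustrated with a small stand-alone list. This is not the TLIST_*() macro implementation itself; it is a singly linked sketch with hypothetical names (`struct pkt`, `list_add_end`, `list_pop_front`) that shows why appending no longer needs to walk the list.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical queued packet and list types for this sketch. */
struct pkt {
    int id;
    struct pkt *next;
};

struct pkt_list {
    struct pkt *head;
    struct pkt *tail;
};

/* Append in constant time: the tail pointer gives the last element. */
void list_add_end(struct pkt_list *l, struct pkt *p)
{
    assert(l != NULL && p != NULL);

    p->next = NULL;
    if (l->tail == NULL) {
        l->head = p; /* list was empty */
    } else {
        l->tail->next = p;
    }
    l->tail = p;
}

/* Remove and return the first element, or NULL if the list is empty. */
struct pkt *list_pop_front(struct pkt_list *l)
{
    assert(l != NULL);

    struct pkt *p = l->head;
    if (p == NULL) {
        return NULL;
    }
    l->head = p->next;
    if (l->head == NULL) {
        l->tail = NULL; /* list became empty */
    }
    p->next = NULL;
    return p;
}
```

With only a head pointer, each append has to walk to the end of the list, which becomes expensive for long packet queues.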
| * When trying to enable/disable a node.Ronnie Sahlberg2010-02-041-0/+20
| | | | | | | | | | | | | | Check if the node is already enabled/disabled and log an information message if so. (This used to be ctdb commit c3eec8f10764a647106087099eeb47b7196f7aac)
| * We only queued up to 1000 packets per queue before we start droppingRonnie Sahlberg2010-02-042-2/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | packets, to keep the queue from growing excessively if smbd has blocked. This could cause traverse packets to become discarded in case the main smbd daemon does a traverse of a database while there is a recovery (sending a reconfigured message to smbd, causing an avalanche of unlock messages to be sent across the cluster.) This avalanche of messages could also cause the traversal message to be discarded, causing the main smbd process to hang indefinitely waiting for the traversal message that will never arrive. Bump the maximum queue length before starting to discard messages from 1000 to 1000000 and at the same time rework the queueing slightly so we can append messages cheaply to the queue instead of walking the list from head to tail every time. (This used to be ctdb commit 59ba5d7f80e0465e5076533374fb9ee862ed7bb6)
| * add two new debug controls to send and receive messagesRonnie Sahlberg2010-02-041-0/+65
| | | | | | | | | | | | ctdb msglisten and msgsend (This used to be ctdb commit 8c89aac20260dc7f3746e29fe99f17422a77cb88)
| * Drop the debug level for logging fd creation to DEBUG_DEBUGRonnie Sahlberg2010-02-049-10/+10
| | | | | | | | (This used to be ctdb commit eae1d4f9e52e73b4d8769868fffdafa590d03784)
| * tdb: fix an early release of the global lock that can cause data corruptionVolker Lendecke2010-02-021-5/+10
| | There was a bug in tdb where the
| |
| |     tdb_brlock(tdb, GLOBAL_LOCK, F_UNLCK, F_SETLKW, 0, 1);
| |
| | (ending the transaction-"mutex") was done before the
| |
| |     /* remove the recovery marker */
| |
| | This means that when a transaction is committed there is a window where another opener of the file sees the transaction marker while the transaction committer is still fully functional and working on it. This led to transaction being rolled back by that second opener of the file while transaction_commit() gave no error to the caller.
| |
| | This patch moves the F_UNLCK to after the recovery marker was removed, closing this window.
| |
| | (This used to be ctdb commit 898b5edfe757cb145960b8f3631029bfd5592119)
* | eventscripts: stop loadconfig function from loading ctdb config file twice.Martin Schwenke2010-01-221-4/+3
| | | | | | | | | | | | | | | | | | If "$1" was empty then loadconfig would load the ctdb config twice. This stops that from happening. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 0406d406da70aaee7ad6aac236114905c5d03ed2)
* | eventscript: Use of $NFS_TICKLE_SHARED_DIRECTORY must be after loadconfig.Martin Schwenke2010-01-221-4/+3
| | | | | | | | | | | | | | | | | | | | | | | | Proper fix for 085d1bea78fabf754ef6dd6d323f74a1d361e45c's workaround. $NFS_TICKLE_SHARED_DIRECTORY was being used before it is set via loadconfig. Ronnie actually spotted this one. :-) Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit ee8b2e298351d05197a2e1494f3331433644c1e6)
* | initscript: Remove bash-ism.Martin Schwenke2010-01-221-1/+1
|/ | | | | | | | | Also, change the order of the comparison so it is consistent with others in the script. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 44696e15cdb23e7656d3bb0ead54f509495738a7)
* initscript: handle spaces in option values inserted into $CTDB_OPTIONS.Martin Schwenke2010-01-221-7/+8
| | | | | | | | | | | | | | | | | | | This puts single quotes around everything and uses eval on the command-lines that actually start ctdbd. The eval causes the single quotes to be interpreted. The "redhat" init style no longer uses the Red Hat daemon function. It loses the quoting and re-splits on spaces. Instead we add an extra line that uses the success/failure functions to keep things pretty. Note that this means that we don't respect daemon's $DAEMON_COREFILE_LIMIT variable but we do our own core file handling with $CTDB_SUPPRESS_COREFILE anyway. daemon's core file handling was probably overriding what we were doing anyway, so this can be regarded as a bug fix. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 522fbb012524fe41a67dbe43589a282dda6bcbe2)
* onnode: update algorithm for finding nodes file.Martin Schwenke2010-01-211-2/+15
| 2 changes:
|
| * If a relative nodes file is specified via -f or $CTDB_NODES_FILE but this file does not exist then try looking for the file in /etc/ctdb (or $CTDB_BASE if set).
|
| * If a nodes file is specified via -f or $CTDB_NODES_FILE but this file does not exist (even when checked as per above) then do not fall back to /etc/ctdb/nodes (or $CTDB_BASE if set). The old behaviour was surprising and hid errors.
|
| Signed-off-by: Martin Schwenke <martin@meltin.net>
|
| (This used to be ctdb commit 60aa570aaa77d293b963105b3f605f9625a4594b)
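The lookup order described by these two changes can be sketched as follows. onnode itself is a shell script, so this C version is only an illustration of the algorithm; `find_nodes_file`, its parameters and the handling of the unspecified case (returning `<base>/nodes` without checking it) are assumptions made for the example.

```c
#include <assert.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Resolve the nodes file into out (a buffer of outlen bytes).
 * spec: file given via -f or $CTDB_NODES_FILE, or NULL if none was given.
 * base: /etc/ctdb, or the value of $CTDB_BASE if set.
 * Returns 0 on success, -1 if a specified file cannot be found.
 */
int find_nodes_file(const char *spec, const char *base,
                    char *out, size_t outlen)
{
    assert(base != NULL && out != NULL && outlen > 0);

    if (spec == NULL) {
        /* Nothing specified: use the default location (assumed here). */
        snprintf(out, outlen, "%s/nodes", base);
        return 0;
    }
    if (access(spec, F_OK) == 0) {
        snprintf(out, outlen, "%s", spec);
        return 0;
    }
    if (spec[0] != '/') {
        /* Relative name that does not exist: retry under base. */
        snprintf(out, outlen, "%s/%s", base, spec);
        if (access(out, F_OK) == 0) {
            return 0;
        }
    }
    /* A specified file is missing: report it, never fall back. */
    out[0] = '\0';
    return -1;
}
```

Returning an error for a missing, explicitly named file is the point of the second change: a silent fallback to the default nodes file would hide the mistake.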
* onnode - respect $CTDB_BASE rather than hard-coding /etc/ctdb.Martin Schwenke2010-01-211-3/+5
| | | | | | Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 503e4908b3028330bc25dc6de8561dbd53ee6a8d)
* config: 10.interface: search "ethtool" in $PATH instead of using a hardcoded ↵Stefan Metzmacher2010-01-201-2/+2
| path
|
| This is very useful for testing, I use such a script:
|
| cat ~/bin/ethtool
| #!/bin/sh
|
| IFACE=$1
| case "$IFACE" in
| Neth2)
|   ;;
| Neth3)
|   ;;
| Neth4)
|   ;;
| Neth5)
|   ;;
| *)
|   exec /usr/sbin/ethtool $@
|   ;;
| esac
|
| ip link set down $IFACE
| exec /usr/sbin/ethtool $@
|
| metze
|
| (This used to be ctdb commit 3bab985cf615720eded4d47b4f9f37a9c28840aa)
* server: reload the public addresses before doing a takeover runStefan Metzmacher2010-01-201-47/+108
| | | | | | metze (This used to be ctdb commit 0e41a2204fa8a1e77dc83c0d4b253ab272b5c72d)
* server: ban ourself if the ctdb and kernel knowledge of a public ip differsStefan Metzmacher2010-01-201-2/+28
| | | | | | metze (This used to be ctdb commit 48e0af91113d6cead6cae3f28d8d8f610cacaa71)
* server: give an error if we're getting a takeover_ip event with a wrong pnnStefan Metzmacher2010-01-201-0/+8
| | | | | | metze (This used to be ctdb commit 2f44d6f3d290cc1b37b19ec34edfbad12cc0c0a7)
* server: return an error if we get a takeover ip event and we cannot serve ↵Stefan Metzmacher2010-01-201-3/+13
| | | | | | | | the ip metze (This used to be ctdb commit f5c221e6abc118aefa489aa7e07755af952fd2bb)
* server: print node number as signed integer on release ip eventStefan Metzmacher2010-01-201-1/+1
| | | | | | metze (This used to be ctdb commit 6c456face30606641f6b8beaad3121c9b05ca763)
* server: debug redundant takeover ip events with level INFOStefan Metzmacher2010-01-201-0/+4
| | | | | | metze (This used to be ctdb commit 7bc9969c4c28f2c4a4848bd730db3c63bb9204fe)