Commit message | Author | Age | Files | Lines
...
| * tests/simple: Unreachable node test should wait for recovery to complete | Martin Schwenke | 2013-08-14 | 1 | -0/+2
|     This should minimise the chances of a control timing out.
|
|     Signed-off-by: Martin Schwenke <martin@meltin.net>
|     Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>
|
|     (This used to be ctdb commit 63be516673c5d9c0d543617bf1bb8bca919956a8)
| * tests/simple: Fix the missing IP test | Martin Schwenke | 2013-08-14 | 1 | -6/+8
|     Update the missing IP test to wait until restarts are complete. Otherwise a service restart can collide with the following monitor event and cause chaos.
|
|     Also, do not disable 10.interface until it matters. Disabling it too early can cause even more chaos if something goes wrong with the monitor step.
|
|     Signed-off-by: Martin Schwenke <martin@meltin.net>
|     Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>
|
|     (This used to be ctdb commit 4e3bd06916bd3adac213fb18c7c2a24854b02d45)
| * recoverd: Use TDB_INCOMPATIBLE_HASH when creating volatile databases | Amitay Isaacs | 2013-08-14 | 1 | -1/+7
|     When creating missing databases either locally or remotely, the recovery master calls ctdb_ctrl_createdb(). The recovery master always passes 0 for tdb_flags. For volatile databases, if TDB_INCOMPATIBLE_HASH is not specified, they will be attached without using the jenkins hash, causing database corruption.
|
|     Signed-off-by: Amitay Isaacs <amitay@gmail.com>
|
|     (This used to be ctdb commit 2fc6b6403707a292d134140fc0b9145b454992c5)
| * Revert "recoverd: Use correct tdb flags when creating missing databases" | Amitay Isaacs | 2013-08-14 | 3 | -10/+8
|     This reverts commit 10a057d8e15c8c18e540598a940d3548c731b0b4.
|
|     This approach would not work when creating local databases since currently there is no control to receive TDB flags for remote databases.
|
|     Signed-off-by: Amitay Isaacs <amitay@gmail.com>
|
|     (This used to be ctdb commit ca61eb776ab862bd269e45ee0f9f96e7e1e0e001)
| * common/io: Keep queue buffer size multiple of 4K | Amitay Isaacs | 2013-08-09 | 1 | -6/+22
|     Currently the queue buffer is realloc'd every time it needs to be extended. Small increments can cause memory fragmentation. Instead, always extend the buffer in multiples of 4K. This should reduce the number of talloc_realloc calls when there are lots of packets in the socket buffer.
|
|     Also, if the queue buffer has grown larger than 64K, throw away the buffer once all the requests in the queue have been processed. That way the queue does not hold on to large buffers.
|
|     Signed-off-by: Amitay Isaacs <amitay@gmail.com>
|
|     (This used to be ctdb commit 5e9b1a7e24d058ff88aaa0563db36a804e866fa9)
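The buffer policy described in this commit message can be sketched roughly as follows. This is only an illustration, not the code from the commit: it uses plain realloc() instead of talloc_realloc so that it stands alone, and the struct, macro, and function names (qbuf, QBUF_STEP, QBUF_KEEP_MAX, and so on) are hypothetical. Only the 4K step and the 64K limit are taken from the message.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* The 4K step and 64K limit come from the commit message;
 * every name in this sketch is hypothetical. */
#define QBUF_STEP      (4u * 1024u)
#define QBUF_KEEP_MAX  (64u * 1024u)

struct qbuf {
	uint8_t *data;  /* heap buffer, NULL when nothing is allocated */
	size_t used;    /* bytes currently stored */
	size_t size;    /* allocated bytes, always a multiple of QBUF_STEP */
};

/* Round n up to the next multiple of QBUF_STEP. */
static size_t qbuf_round_up(size_t n)
{
	return ((n + QBUF_STEP - 1) / QBUF_STEP) * QBUF_STEP;
}

/* Append len bytes, growing the allocation in whole QBUF_STEP units.
 * Returns 0 on success, -1 if realloc fails (buffer left unchanged). */
static int qbuf_append(struct qbuf *q, const void *src, size_t len)
{
	size_t needed = q->used + len;

	assert(q->size % QBUF_STEP == 0);
	if (len == 0) {
		return 0;
	}
	if (needed > q->size) {
		size_t new_size = qbuf_round_up(needed);
		uint8_t *p = realloc(q->data, new_size);
		if (p == NULL) {
			return -1;
		}
		q->data = p;
		q->size = new_size;
	}
	memcpy(q->data + q->used, src, len);
	q->used += len;
	return 0;
}

/* Call once every queued request has been processed: a buffer that
 * grew past QBUF_KEEP_MAX is released, a smaller one is kept for reuse. */
static void qbuf_drained(struct qbuf *q)
{
	q->used = 0;
	if (q->size > QBUF_KEEP_MAX) {
		free(q->data);
		q->data = NULL;
		q->size = 0;
	}
}
```

Rounding up means many small packets share one allocation step, and releasing only oversized buffers keeps the common case free of repeated allocation.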
| * packaging: Allow setting custom release number in RPM spec file | Martin Schwenke | 2013-08-09 | 4 | -8/+16
|     Signed-off-by: Martin Schwenke <martin@meltin.net>
|     Pair-Programmed-With: Amitay Isaacs <amitay@gmail.com>
|
|     (This used to be ctdb commit 867afb247bd8cc86c8d738f051a44cc534cafacf)
| * ctdbd: When a record is made sticky, log only once | Amitay Isaacs | 2013-08-09 | 1 | -2/+3
|     Instead of logging from ctdb_request_call(), log the message from ctdb_make_record_sticky(). That way if the record is already sticky, the message is not repeated unnecessarily.
|
|     Signed-off-by: Amitay Isaacs <amitay@gmail.com>
|
|     (This used to be ctdb commit 44a64d1c388bfe3c3388b191edfaedecfb7bb831)
| * ctdbd: Improve high hopcount log messages when request is redirected | Amitay Isaacs | 2013-08-09 | 1 | -5/+5
|     Signed-off-by: Amitay Isaacs <amitay@gmail.com>
|
|     (This used to be ctdb commit 9cde47e1a5bf1b9ca3b4da8c2db94caac2b1aa5e)
| * scripts: Do not run ctdb tool commands when debugging hung "init" event | Martin Schwenke | 2013-08-09 | 1 | -2/+6
|     The CTDB daemon is not ready to accept clients in the INIT runstate (init event). It starts accepting connections in the SETUP runstate (setup event) and later.
|
|     Also, minor log formatting changes.
|
|     Signed-off-by: Martin Schwenke <martin@meltin.net>
|
|     (This used to be ctdb commit 81d7ce03b28d592a1337639e14d9ea141e20bfff)
| * ctdbd: Avoid leaking file descriptor if talloc fails | Amitay Isaacs | 2013-08-09 | 1 | -1/+4
|     Signed-off-by: Amitay Isaacs <amitay@gmail.com>
|
|     (This used to be ctdb commit d7f6bc3fed2dc61e6e587b4c0ec0ac27d533bbbe)
| * eventscript: Wait for debug hung script to finish or timeout before continuing | Amitay Isaacs | 2013-08-09 | 1 | -13/+59
|     Currently, if the debug hung script takes a long time to finish, the subsequent monitor event can collide with the previous event, which has not yet finished.
|
|     Signed-off-by: Amitay Isaacs <amitay@gmail.com>
|
|     (This used to be ctdb commit 9e99e0eb072e2b845914ee3896acbc66b96138d7)
| * eventscripts: Use configured RECLOCK file instead of asking CTDB | Amitay Isaacs | 2013-08-09 | 1 | -7/+5
|     On a cluster where the recovery lock file is not being used, asking the CTDB daemon is unnecessary overhead. And if CTDB is using a recovery lock file, then changing the configuration without restarting is *stupid*.
|
|     Signed-off-by: Amitay Isaacs <amitay@gmail.com>
|     Pair-Programmed-With: Martin Schwenke <martin@meltin.net>
|
|     (This used to be ctdb commit 44eb86e6042adb6efe75d2a5528b82a0f21d496d)
| * locking: Do not create multiple lock processes for the same key | Amitay Isaacs | 2013-08-09 | 1 | -6/+15
|     If there are multiple lock helper processes waiting for the same record, they will cause a thundering herd when that record is unlocked. So avoid scheduling lock contexts for the same record.
|
|     This also means that multiple requests will get queued up behind the same lock context and can be processed quickly once the lock has been obtained.
|
|     Signed-off-by: Amitay Isaacs <amitay@gmail.com>
|
|     (This used to be ctdb commit ebecc3a18f1cb397a78b56eaf8f752dd5495bcc9)
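A minimal sketch of the scheduling rule described here, assuming pending and active lock contexts are kept on simple linked lists. The struct layout and the function names are hypothetical, and keys are plain C strings for brevity (real record keys are binary blobs); this is not the code from the commit.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical lock context: one entry per lock request. */
struct lock_ctx {
	const char *key;        /* record key; a C string here for brevity */
	struct lock_ctx *next;  /* next context on the same list */
};

/* Return 1 if any context on the list is for the given key. */
static int key_in_list(const struct lock_ctx *list, const char *key)
{
	for (; list != NULL; list = list->next) {
		if (strcmp(list->key, key) == 0) {
			return 1;
		}
	}
	return 0;
}

/* Pick the first pending context whose key has no active lock helper.
 * Contexts for a key that is already being locked stay queued, so only
 * one helper process ever waits on a given record. Returns NULL when
 * every pending context collides with an active one. */
static struct lock_ctx *pick_schedulable(struct lock_ctx *pending,
					 const struct lock_ctx *active)
{
	assert(pending == NULL || pending->key != NULL);
	for (; pending != NULL; pending = pending->next) {
		if (!key_in_list(active, pending->key)) {
			return pending;
		}
	}
	return NULL;
}
```

With one active context for a key, every pending context for that key is skipped until the active one completes, which is what prevents the thundering herd.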
| * locking: Move function find_lock_context() before ctdb_lock_schedule() | Amitay Isaacs | 2013-08-09 | 1 | -53/+53
|     So that ctdb_lock_schedule() can call this function without requiring extra prototype declaration.
|
|     Signed-off-by: Amitay Isaacs <amitay@gmail.com>
|
|     (This used to be ctdb commit 68af5405acc123b5a90decd2123e2a02961a8fcf)
| * ctdbd: Print set db sticky message after it's set | Amitay Isaacs | 2013-08-01 | 1 | -3/+2
|     Signed-off-by: Amitay Isaacs <amitay@gmail.com>
|
|     (This used to be ctdb commit 824dcec35ec461d78e22b2ea109473b32bfe3972)
| * tests: Add a test program to hold a lock on a database | Amitay Isaacs | 2013-08-01 | 2 | -1/+47
|     Signed-off-by: Amitay Isaacs <amitay@gmail.com>
|
|     (This used to be ctdb commit f6b066a23610fb0092298861c21a9b354b91e2f1)
| * recoverd: Use correct tdb flags when creating missing databases | Amitay Isaacs | 2013-08-01 | 3 | -8/+10
|     When creating missing databases either locally or remotely, make sure to use the correct tdb flags from other nodes. Without this, volatile databases can get attached without TDB_INCOMPATIBLE_HASH flag.
|
|     Signed-off-by: Amitay Isaacs <amitay@gmail.com>
|
|     (This used to be ctdb commit 10a057d8e15c8c18e540598a940d3548c731b0b4)
| * client: Always use jenkins hash when attaching volatile databases | Amitay Isaacs | 2013-08-01 | 1 | -0/+8
|     Signed-off-by: Amitay Isaacs <amitay@gmail.com>
|
|     (This used to be ctdb commit 7e7e59c4047c78159387089eca65d90037bcf722)
| * recoverd: Make sure to use jenkins hash for recovery databases | Amitay Isaacs | 2013-08-01 | 1 | -1/+1
|     Signed-off-by: Amitay Isaacs <amitay@gmail.com>
|
|     (This used to be ctdb commit 32c83e209823e9a4d6306bb7fd63d4500f3e2668)
| * recoverd: Assemble up-to-date node flags information from remote nodes | Amitay Isaacs | 2013-07-30 | 1 | -0/+17
|     Currently the nodemap used by the recovery master is the one obtained from the local node. This information may have been updated while processing the main loop. Before comparing node flags on all the nodes, create up-to-date node flags information based on the information received from all the nodes.
|
|     Signed-off-by: Amitay Isaacs <amitay@gmail.com>
|
|     (This used to be ctdb commit fcf77dec5af973a0e32f3999bc012053a6f47a96)
| * tools/ctdb: Only print the hot records with non-zero hopcount | Amitay Isaacs | 2013-07-30 | 1 | -0/+9
|     Signed-off-by: Amitay Isaacs <amitay@gmail.com>
|
|     (This used to be ctdb commit 049d9beb3783482490e6273a434ccbad23f85f0a)
| * ctdbd: Don't consider a hot record if the hopcount is zero | Amitay Isaacs | 2013-07-30 | 1 | -0/+3
|     Signed-off-by: Amitay Isaacs <amitay@gmail.com>
|
|     (This used to be ctdb commit ab35773518ad15588013f4d859f7bee790437450)
| * ctdbd: Fix updating of hot keys in database statistics | Amitay Isaacs | 2013-07-29 | 1 | -7/+13
|     Signed-off-by: Amitay Isaacs <amitay@gmail.com>
|
|     (This used to be ctdb commit fde4b4db5a57f75c5efa5647c309f33e0d5a68f3)
| * ctdbd: Remove incomplete ctdb_db_statistics_wire structure | Amitay Isaacs | 2013-07-29 | 3 | -47/+21
|     Instead of maintaining another structure, add an element as a placeholder for the marshall buffer of hot keys. This avoids duplication of the structure.
|
|     Signed-off-by: Amitay Isaacs <amitay@gmail.com>
|
|     (This used to be ctdb commit e73b2e12adc9db1dedb48d32bba3a8406a80f4cd)
| * Revert "ctdbd: Remove incomplete ctdb_db_statistics_wire structure" | Amitay Isaacs | 2013-07-29 | 5 | -16/+95
|     The structure cannot be removed without adding support for marshalling keys for hot records.
|
|     This reverts commit 26a4653df594d351ca0dc1bd5f5b2f5b0eb0a9a5.
|
|     Signed-off-by: Amitay Isaacs <amitay@gmail.com>
|
|     (This used to be ctdb commit 023ca2e84f5ed064a288526b9c2bc7e06674dd81)
| * doc: Update XML files to use standard DocBook DTD | Martin Schwenke | 2013-07-29 | 5 | -5/+15
|     This simplifies building since we don't use any of the Samba extensions.
|
|     Signed-off-by: Martin Schwenke <martin@meltin.net>
|
|     (This used to be ctdb commit 57aa2dffea60abd73a95233f8b761cc676adebb6)
| * initscript: The wrapper script should export CTDB_SOCKET | Martin Schwenke | 2013-07-29 | 1 | -0/+2
|     This ensures that any invocation of the ctdb tool (within the wrapper) gets the desired value. This at least ensures that ctdbd will be started. If a non-standard value is set for CTDB_SOCKET then command-line users will still need the variable in their environment.
|
|     Signed-off-by: Martin Schwenke <martin@meltin.net>
|     Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>
|
|     (This used to be ctdb commit 37ccc7c6cc43a80aaa92291aea7a438f4225488a)
| * ctdbd: Kill client process without checking for tracked child | Martin Schwenke | 2013-07-29 | 1 | -1/+1
|     Commit f73a4b1495830bcdd094a93732a89dd53b3c2f78 added a safety check to ensure that CTDB never kills unrelated processes. However, client processes are unrelated.
|
|     Signed-off-by: Martin Schwenke <martin@meltin.net>
|
|     (This used to be ctdb commit 782814288bb560099ee44b607bf35f3eddf37f82)
| * eventscripts: kill_tcp_connections() should send connections to stdin | Martin Schwenke | 2013-07-29 | 6 | -11/+98
|     This avoids issuing multiple "ctdb killtcp" commands to terminate tcp connections, one per connection. This will considerably reduce the time when there is a large number of tcp connections.
|
|     This also makes it possible to avoid calling "ctdb killtcp" when there are no connections.
|
|     Add a couple of unit tests for killtcp and update eventscript unit test infrastructure to support.
|
|     Signed-off-by: Martin Schwenke <martin@meltin.net>
|     Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>
|
|     (This used to be ctdb commit a20d94717d2e4ab866d8a002cdf39c0669b74c6a)
| * tools/ctdb: Allow killtcp to read connections from standard input | Martin Schwenke | 2013-07-29 | 2 | -5/+117
|     This allows eventscripts to send information about multiple tcp connections to a single "ctdb killtcp" command, saving the overhead of setting up a client connection per tcp connection.
|
|     Signed-off-by: Martin Schwenke <martin@meltin.net>
|     Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>
|
|     (This used to be ctdb commit af5aa369c266430fe912df0c26116b68bac3572e)
| * tests: Always tally the number of passed/failed tests | Martin Schwenke | 2013-07-29 | 1 | -2/+5
|     Regardless of whether a summary is being printed!
|
|     Signed-off-by: Martin Schwenke <martin@meltin.net>
|
|     (This used to be ctdb commit a69e03a5e4671e998d45b4fef8611a421bbdb3e1)
| * recoverd: Call takeover fail callback only once per node | Martin Schwenke | 2013-07-29 | 1 | -2/+47
|     Currently the fail callback is called once per (takeip/releaseip) control failure. This is overkill and can get a node banned much too quickly.
|
|     Instead, keep track of control failures per node and only call the fail callback once per failed node.
|
|     Signed-off-by: Martin Schwenke <martin@meltin.net>
|     Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>
|
|     (This used to be ctdb commit bf4a7c1ad87e0e848296d15d63eb8cd901ca5335)
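The per-node bookkeeping described here could look roughly like the sketch below. It is not the code from the commit: the fixed MAX_NODES limit, the struct, the callback type, and all function names are hypothetical, chosen only to show the idea of recording failures first and reporting each failed node once.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_NODES 16  /* hypothetical fixed cluster size for this sketch */

/* Hypothetical callback type: invoked for a node that failed. */
typedef void (*fail_callback_t)(uint32_t pnn, void *private_data);

/* One flag per node, no matter how many controls failed on it. */
struct takeover_failures {
	bool failed[MAX_NODES];
};

/* Called for every failed takeip/releaseip control: only records
 * that the node failed, does not invoke the callback. */
static void record_control_failure(struct takeover_failures *f, uint32_t pnn)
{
	assert(pnn < MAX_NODES);
	f->failed[pnn] = true;
}

/* Called once after all controls have completed: invokes the callback
 * exactly once per failed node and returns the number of failed nodes. */
static unsigned int report_failed_nodes(const struct takeover_failures *f,
					fail_callback_t cb,
					void *private_data)
{
	unsigned int count = 0;
	uint32_t pnn;

	for (pnn = 0; pnn < MAX_NODES; pnn++) {
		if (f->failed[pnn]) {
			cb(pnn, private_data);
			count++;
		}
	}
	return count;
}

/* Example callback: counts how many times it was invoked. */
static void count_calls(uint32_t pnn, void *private_data)
{
	(void)pnn;
	(*(unsigned int *)private_data)++;
}
```

Three failed controls on one node plus one on another result in two callback invocations rather than four, so a node with many public IPs is not penalised in proportion to its IP count.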
| * scripts: Run scriptstatus for hung event | Martin Schwenke | 2013-07-29 | 1 | -2/+17
|     The timeout information printed by ctdbd is less than useful because it refers to the cumulative time taken by the eventscripts run so far. Adding scriptstatus output indicates where time was actually spent.
|
|     Since there is now quite a bit of output, serialise the calls to this script using flock.
|
|     Signed-off-by: Martin Schwenke <martin@meltin.net>
|     Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>
|
|     (This used to be ctdb commit 1b016b2dfc5d7d3f2a42ce4dfe569608e90eb714)
| * ctdbd: Pass event name to hung script debugger | Martin Schwenke | 2013-07-23 | 1 | -2/+3
|     Signed-off-by: Martin Schwenke <martin@meltin.net>
|     Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>
|
|     (This used to be ctdb commit e0f3fa1020e13b84bdd672538168d148f1847d57)
| * tests/complex: Fix NFS tests to work with root_squash | Martin Schwenke | 2013-07-23 | 4 | -49/+50
|     Refactor the NFS test setup/cleanup code into new common functions.
|
|     Signed-off-by: Martin Schwenke <martin@meltin.net>
|     Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>
|
|     (This used to be ctdb commit 29e98017221326bdc9b1c4f7c05b3b495c1de29b)
| * tests: Fix exit status of run_tests when a single test is run with -H | Martin Schwenke | 2013-07-22 | 1 | -6/+6
|     Signed-off-by: Martin Schwenke <martin@meltin.net>
|
|     (This used to be ctdb commit 9d6e1c147bd036d832b98c155f405ee2a5d6f57f)
| * tests/simple: Add -p in onnode test to help show groups of connections | Martin Schwenke | 2013-07-22 | 1 | -1/+1
|     Change the command from "true" to "hostname" since the former won't produce any output when used in combination with "onnode -p". This could just be changed to "echo" but the hostname might actually be useful.
|
|     Signed-off-by: Martin Schwenke <martin@meltin.net>
|
|     (This used to be ctdb commit ae3c03d80264e997b7da9f3279d7810e18b8a1df)
| * ctdbd: Sleep at exit to allow time for log messages to flush | Martin Schwenke | 2013-07-19 | 1 | -4/+9
|     Register print_exit_message() earlier so that it covers most of the early exits.
|
|     Signed-off-by: Martin Schwenke <martin@meltin.net>
|     Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>
|
|     (This used to be ctdb commit 90d792cf28d6a823141e4c417b6978f02a9cf596)
| * ctdbd: Exit if something is already listening on CTDB socket | Martin Schwenke | 2013-07-19 | 1 | -9/+18
|     Don't blindly remove the socket.
|
|     Signed-off-by: Martin Schwenke <martin@meltin.net>
|
|     (This used to be ctdb commit 3dd5b925dcf0e9a5b877638e471c5ecf36b46c58)
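One common way to tell whether a Unix domain socket path still has a live listener is to try connecting to it before removing it. The sketch below shows that general technique with standard POSIX calls; it is an assumption that the commit does something along these lines, and the function name socket_has_listener() is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Return true if a process accepts connections on the Unix domain
 * socket at path. A stale socket file (no listener) or a missing
 * path makes connect() fail, so false is returned in those cases. */
static bool socket_has_listener(const char *path)
{
	struct sockaddr_un addr;
	bool in_use;
	int fd;

	assert(path != NULL);
	if (strlen(path) >= sizeof(addr.sun_path)) {
		return false;  /* path cannot be a valid socket address */
	}

	fd = socket(AF_UNIX, SOCK_STREAM, 0);
	if (fd == -1) {
		return false;
	}

	memset(&addr, 0, sizeof(addr));
	addr.sun_family = AF_UNIX;
	strcpy(addr.sun_path, path);  /* length checked above */

	in_use = (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) == 0);
	close(fd);
	return in_use;
}
```

A daemon could call such a check at startup and exit when it returns true, removing the socket file only when nothing answers.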
| * tests/eventscripts: Add tests for monitoring of missing interfaces | Martin Schwenke | 2013-07-19 | 4 | -54/+108
|     Signed-off-by: Martin Schwenke <martin@meltin.net>
|
|     (This used to be ctdb commit 53e4eca74429f76adc81d98e3d11d1bd61194d71)
| * eventscripts: A missing interface should cause monitoring to fail | Martin Schwenke | 2013-07-19 | 1 | -2/+3
|     A missing interface is at least as bad as an interface with a link that is down so should have a similar effect.
|
|     This couldn't be done previously because orphaned interfaces used to be listed for monitoring. This was worked around in 10.interface in commit 49b2d1bd9554461ed8edbfc21e777c0eca9e1443 and fixed in ctdbd in commit cc1a3ae911d3fee8b87fda5de5ab6d9499d7510a.
|
|     If $CTDB_PARTIALLY_ONLINE_INTERFACES="yes" then monitoring won't actually fail but the interface is still marked as down.
|
|     While we're touching this code, use "ip link" instead of "ip addr". It is marginally cheaper but not enough for a separate patch. ;-)
|
|     This effectively reverts d67955b42f7627be9dae995230c8fcbb8a948ec2.
|
|     Signed-off-by: Martin Schwenke <martin@meltin.net>
|
|     (This used to be ctdb commit 501f19b16fd6d67fbb754248868c38ee5bcf79ef)
| * eventscripts: Get list of configured interfaces using "ctdb ifaces" | Martin Schwenke | 2013-07-19 | 1 | -3/+3
|     This was previously changed because ctdbd didn't garbage collect orphaned interfaces. This was fixed in commit cc1a3ae911d3fee8b87fda5de5ab6d9499d7510a.
|
|     Signed-off-by: Martin Schwenke <martin@meltin.net>
|
|     (This used to be ctdb commit c6ab0f9405d5fa5b0b1693bc92e59da0d555a9d7)
| * ctdbd: Allow extra recovery to repair persistent DBs during first recovery | Martin Schwenke | 2013-07-19 | 1 | -1/+1
|     Commit 8076773a9924dcf8aff16f7d96b2b9ac383ecc28 introduced a potential regression because a node may not have completed the "recovered" event (so might still be in CTDB_RUNSTATE_FIRST_RECOVERY) when another node becomes healthy.
|
|     Signed-off-by: Martin Schwenke <martin@meltin.net>
|
|     (This used to be ctdb commit 57ef5d3827ea3417a32703e259a53ce6fd10ac45)
| * packaging: Bundle debug_locks.sh script in RPM | Amitay Isaacs | 2013-07-16 | 2 | -0/+2
|     Signed-off-by: Amitay Isaacs <amitay@gmail.com>
|
|     (This used to be ctdb commit 5740155cc5de1a223412e8529aa1a383a5412514)
| * packaging: No need to check for existence of scripts, they always do | Amitay Isaacs | 2013-07-16 | 1 | -3/+3
|     Signed-off-by: Amitay Isaacs <amitay@gmail.com>
|
|     (This used to be ctdb commit 67c227a5d30cb8487b20b19b20bdfa4613906609)
| * scripts: ctdbd_wrapper logs a message to syslog if syslog is not being used | Martin Schwenke | 2013-07-11 | 1 | -0/+8
|     It can be very disconcerting when logging to syslog is expected but nothing is being logged there.
|
|     Signed-off-by: Martin Schwenke <martin@meltin.net>
|
|     (This used to be ctdb commit 412bc0e20bef694d4e911dc9c984fd7716231f1f)
| * Update Nagios check to work with ctdb versions past 30 Aug 2011 | Mathieu Parent | 2013-07-11 | 1 | -1/+5
|     Because of commit a779d83a6213e2ba
|
|     Signed-off-by: Martin Schwenke <martin@meltin.net>
|
|     (This used to be ctdb commit a4afe7af9c9391048d6f80135bbd5e15367770c7)
| * recoverd: Really fix bogus info in message about changed flags | Martin Schwenke | 2013-07-11 | 1 | -2/+2
|     Commit 9119a568c2b4601318f7751f537dca2f92a7230b attempted to fix this. However, this was wrong because old_flags and new_flags were confused. The latter has since been fixed in commit 7eb2f89979360b6cc98ca9b17c48310277fa89fc so this can now be fixed properly.
|
|     Signed-off-by: Martin Schwenke <martin@meltin.net>
|
|     (This used to be ctdb commit 40f2825d6e818dc8c745b6385a545969dfb45fbc)
| * doc: Update NEWS | Martin Schwenke | 2013-07-11 | 1 | -0/+2
|     Signed-off-by: Martin Schwenke <martin@meltin.net>
|
|     (This used to be ctdb commit 76703514040b804b880cab909f6ff52576f80f89)
| * Print deleted nodes as well | Sumit Bose | 2013-07-11 | 1 | -1/+12
|     Signed-off-by: Amitay Isaacs <amitay@gmail.com>
|
|     (This used to be ctdb commit 0930a3b806977555509c3228726e2250aef1f971)