Commit message  Author  Age  Files  Lines
...
| * scripts: Fix script_log() regression  Martin Schwenke  2013-05-06  1  -1/+1
| | 5940a2494e9e43a83f2bca098bd04dfc1a8f2e93 makes script_log() always pass
| | a message to logger, so script_log() can no longer log stdin. Put all
| | the tag fu in the actual tag so the message argument is empty if no
| | message was passed.
| |
| | Signed-off-by: Martin Schwenke <martin@meltin.net>
| | (This used to be ctdb commit 9dee4c84273633b9ad82e94dabbf0e6f86edbcef)
| * initscript: Look for tdbtool/tdbdump using which, not in fixed locations  Martin Schwenke  2013-05-06  1  -4/+4
| | Signed-off-by: Martin Schwenke <martin@meltin.net>
| | (This used to be ctdb commit c74cc0442eb90d859eae270b59456d28605817c4)
| * ctdbd: Log CTDB startup before creating the PID file  Martin Schwenke  2013-05-06  1  -1/+1
| | Otherwise the messages are in a stupid order... :-)
| |
| | Signed-off-by: Martin Schwenke <martin@meltin.net>
| | Reported-by: Amitay Isaacs <amitay@gmail.com>
| | (This used to be ctdb commit cd87ba85fc6c375758c7d3dfa8dbd4d8a02074b0)
| * ctdbd: Remove the "stopped" event  Martin Schwenke  2013-05-06  6  -61/+9
| | It isn't used; it has been superseded by "ipreallocated".
| |
| | Signed-off-by: Martin Schwenke <martin@meltin.net>
| | (This used to be ctdb commit c2bb8596a8af6406ef50e53953884df9d6246a96)
| * eventscripts: Remove use of "stopped" event  Martin Schwenke  2013-05-06  2  -2/+2
| | Use "ipreallocated" instead.
| |
| | The "stopped" event pre-dates the "ipreallocated" event. The only way
| | of stopping a node is via the ctdb tool, which explicitly causes a
| | takeover run to occur after the node is stopped. The takeover run will
| | generate an "ipreallocated" event.
| |
| | Signed-off-by: Martin Schwenke <martin@meltin.net>
| | (This used to be ctdb commit 978d4a0d6d8c9877b23f72e3a7b78c1245d16908)
| * recoverd: ctdb_takeover_run() uses CTDB_CONTROL_IPREALLOCATED  Martin Schwenke  2013-05-06  1  -4/+2
| | This means "ipreallocated" is now run on stopped nodes.
| |
| | Signed-off-by: Martin Schwenke <martin@meltin.net>
| | (This used to be ctdb commit 83b61f7414b1f7a3424497ac987ca0724fba9eaa)
| * ctdbd: New control CTDB_CONTROL_IPREALLOCATED  Martin Schwenke  2013-05-06  5  -0/+65
| | This is an alternative to using ctdb_run_eventscripts() that can be
| | used when in recovery.
| |
| | Signed-off-by: Martin Schwenke <martin@meltin.net>
| | (This used to be ctdb commit 27a44685f0d7a88804b61a1542bb42adc8f88cb1)
| * ctdbd: Avoid freeing non-monitor event callback when monitoring is disabled  Martin Schwenke  2013-05-06  1  -31/+30
| | When running a non-monitor event, a check is made for any active
| | monitor events. If there is an active monitor event, it is cancelled.
| | This is done by freeing state->callback, which is allocated from
| | monitor_context.
| |
| | When CTDB is stopped or shut down, monitoring is disabled by freeing
| | monitor_context, which frees the callback, and then the stopped or
| | shutdown event is run. This creates a new callback structure, which is
| | allocated at the exact same memory location as the monitor callback
| | that was just freed. So the check for active monitor events frees the
| | new callback for the non-monitor event. Since it is the callback
| | function that flags successful completion of the event, the event is
| | never marked complete and CTDB is stuck in a loop waiting for
| | completion.
| |
| | Move the monitor cancellation to the top of the function so that this
| | can't happen.
| |
| | The following log snippet highlights the problem.
| |
| | 2013/04/30 16:54:10.673807 [21505]: Received SHUTDOWN command. Stopping CTDB daemon.
| | 2013/04/30 16:54:10.673814 [21505]: Shutting down recovery daemon
| | 2013/04/30 16:54:10.673852 [21505]: server/eventscript.c:696 in remove_callback 0x1c6d5c0
| | 2013/04/30 16:54:10.673858 [21505]: Monitoring has been stopped
| | 2013/04/30 16:54:10.673899 [21505]: server/eventscript.c:594 Sending SIGTERM to child pid:23847
| | 2013/04/30 16:54:10.673913 [21505]: server/eventscript.c:629 searching for callback 0x1c6d5c0
| | 2013/04/30 16:54:10.673932 [21505]: server/eventscript.c:641 running callback
| | 2013/04/30 16:54:10.673939 [21505]: server/eventscript.c:866 in event_script_callback
| | 2013/04/30 16:54:10.673946 [21505]: server/eventscript.c:696 in remove_callback 0x1c6d5c0
| |
| | Signed-off-by: Martin Schwenke <martin@meltin.net>
| | Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>
| | (This used to be ctdb commit 05f785b51cfd8b22b3ae35bf034127fbc07005be)
| * recoverd: Interface reference count changes should not cause takeover runs  Martin Schwenke  2013-05-02  1  -23/+47
| | At the moment a naive comparison of all the interface data is done.
| | So, if any IPs move, the reference counts for the relevant interfaces
| | change, the interfaces appear to have changed, and another takeover
| | run is initiated by each node that took or released IPs.
| |
| | This change stops the spurious takeover runs by changing the interface
| | comparison to ignore the reference counts.
| |
| | Signed-off-by: Martin Schwenke <martin@meltin.net>
| | (This used to be ctdb commit 0b7257642f62ebd83c05b6e2922f0dc2737f175c)
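The idea of the interface comparison change can be illustrated with a small toy model. This is only a sketch in Python, not the ctdb C code: the `Iface` type and its field names (`name`, `link_up`, `references`) are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Iface:
    # Hypothetical stand-in for per-interface data; not the ctdb struct.
    name: str
    link_up: bool
    references: int  # number of public IPs currently hosted on the interface


def interfaces_changed(old, new):
    """Report a change only if names or link states differ.

    Reference counts are deliberately left out of the comparison, so an
    IP moving between interfaces does not look like an interface change.
    """
    def key(ifaces):
        return sorted((i.name, i.link_up) for i in ifaces)

    return key(old) != key(new)


before = [Iface("eth0", True, 2), Iface("eth1", True, 1)]
after_ip_move = [Iface("eth0", True, 1), Iface("eth1", True, 2)]
after_link_down = [Iface("eth0", False, 1), Iface("eth1", True, 2)]
```

In this model a naive `before != after_ip_move` comparison reports a difference, while `interfaces_changed(before, after_ip_move)` does not; a real link state change is still detected.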
| * recover: use CTDB_REC_RO_FLAGS where appropriate  Michael Adam  2013-04-24  1  -13/+5
| | Signed-off-by: Michael Adam <obnox@samba.org>
| | Reviewed-By: Amitay Isaacs <amitay@gmail.com>
| | (This used to be ctdb commit b5a8791268e938d7e017056e0e2bd2cbec1fa690)
| * ctdb_daemon: use CTDB_REC_RO_FLAGS where appropriate  Michael Adam  2013-04-24  1  -1/+1
| | Signed-off-by: Michael Adam <obnox@samba.org>
| | Reviewed-By: Amitay Isaacs <amitay@gmail.com>
| | (This used to be ctdb commit c7eab97c7a939710b73aae2d75b404b235a998f5)
| * ctdb_call: use CTDB_REC_RO_FLAGS where appropriate  Michael Adam  2013-04-24  1  -1/+1
| | Signed-off-by: Michael Adam <obnox@samba.org>
| | Reviewed-By: Amitay Isaacs <amitay@gmail.com>
| | (This used to be ctdb commit f99eb2f56d8ca27110a45ae0e1c4bff40ac7a60e)
| * vacuum: use CTDB_REC_RO_FLAGS in the vacuuming code  Michael Adam  2013-04-24  1  -10/+2
| | Signed-off-by: Michael Adam <obnox@samba.org>
| | Reviewed-By: Amitay Isaacs <amitay@gmail.com>
| | (This used to be ctdb commit a62775334aa20d1d850d2df705eb70303b04ac5c)
| * ltdb_server: use CTDB_REC_RO_FLAGS where appropriate  Michael Adam  2013-04-24  1  -2/+2
| | Signed-off-by: Michael Adam <obnox@samba.org>
| | Reviewed-By: Amitay Isaacs <amitay@gmail.com>
| | (This used to be ctdb commit 61f17e53576197def46bc61fdf0cdb5282333a3e)
| * include: define CTDB_REC_RO_FLAGS - all read-only related record flags  Michael Adam  2013-04-24  1  -0/+4
| | This is used for some checks
| |
| | Signed-off-by: Michael Adam <obnox@samba.org>
| | Reviewed-By: Amitay Isaacs <amitay@gmail.com>
| | (This used to be ctdb commit c7924ce6404bb18641b00d5fbd2fe9da9aaf7959)
| * vacuum: Update (C)  Michael Adam  2013-04-24  1  -1/+1
| | Signed-off-by: Michael Adam <obnox@samba.org>
| | Reviewed-By: Amitay Isaacs <amitay@gmail.com>
| | (This used to be ctdb commit 61264debba58355b9716ac1637fdedef5ed249c8)
| * vacuum: extend the header comment for ctdb_process_delete_list()  Michael Adam  2013-04-24  1  -2/+20
| | Describe the (new) process more precisely, and mention that it is the
| | last step of the vacuuming process that is performed on the lmaster.
| |
| | Signed-off-by: Michael Adam <obnox@samba.org>
| | Reviewed-By: Amitay Isaacs <amitay@gmail.com>
| | (This used to be ctdb commit 06de786c786f1cab4c6721adf47c2cb1e8a72adb)
| * vacuum: turn the vacuuming on lmaster into a three-phase process.  Michael Adam  2013-04-24  1  -25/+278
| | More precisely, before locally deleting an empty record that has been
| | migrated with data and that we are dmaster and lmaster for, we now
| | perform the deletion on the other nodes in two steps instead of a
| | single step.
| |
| | - First send out the list of records to be deleted to all other nodes
| |   with the new RECEIVE_RECORDS control to store the lmaster's current
| |   empty copy.
| | - Then send those records that could be stored on all nodes to all
| |   nodes again with the TRY_DELETE_RECORDS control, as before, for
| |   deletion.
| | - Finally delete those records locally that were successfully deleted
| |   remotely in the previous step.
| |
| | This fixes an old race where a recovery that hits the vacuum process
| | square between the eyes can create gaps in the record's history and
| | hence let the records resurrect.
| |
| | In the case of the locking.tdb, that could mean that a file that was
| | already closed was recorded as being open and locked again, so samba
| | clients were locked out of that file until samba was restarted.
| |
| | Signed-off-by: Michael Adam <obnox@samba.org>
| | Reviewed-By: Amitay Isaacs <amitay@gmail.com>
| | (This used to be ctdb commit eee23d44b6427be8ab49bbfcee3abb62f37dfcc7)
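The three phases described in this commit message can be sketched as a toy model. This is not the ctdb implementation: the `Node` class, its `busy` set (keys a node refuses to touch) and the method names are hypothetical, chosen only to mirror the RECEIVE_RECORDS and TRY_DELETE_RECORDS steps.

```python
class Node:
    """Toy node: records maps key -> data, and "" means an empty record."""

    def __init__(self, records, busy=()):
        self.records = dict(records)
        self.busy = set(busy)  # assumption: keys this node refuses to handle

    def receive_record(self, key):
        """Phase 1: store the lmaster's current empty copy, if possible."""
        if key in self.busy:
            return False
        self.records[key] = ""
        return True

    def try_delete_record(self, key):
        """Phase 2: delete the record, but only if it is still empty."""
        if key in self.busy or self.records.get(key, "") != "":
            return False
        self.records.pop(key, None)
        return True


def vacuum(lmaster, remotes, candidates):
    # Phase 1: send every candidate to all remote nodes (a list, not a
    # generator, so every node is contacted even after one refusal).
    stored = [k for k in candidates
              if all([n.receive_record(k) for n in remotes])]
    # Phase 2: only records stored everywhere are sent for deletion.
    deleted = [k for k in stored
               if all([n.try_delete_record(k) for n in remotes])]
    # Phase 3: delete locally what was deleted on all remote nodes.
    for k in deleted:
        lmaster.records.pop(k, None)
    return deleted
```

A record that any remote node could not store in the first phase is never deleted anywhere in this model, which is the property the extra phase is meant to provide.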
| * vacuum: introduce the RECEIVE_RECORDS control  Michael Adam  2013-04-24  4  -0/+209
| | This is in preparation for turning the vacuuming on the lmaster into
| | a two-phase process:
| |
| | - First the node sends the list of records to be vacuumed to all
| |   other nodes with this new RECEIVE_RECORDS control. The remote nodes
| |   should store the lmaster's empty current copy.
| | - Only those records that could be stored on all other nodes are
| |   processed further. They are sent to all other nodes with the
| |   TRY_DELETE_RECORDS control, as before, for deletion.
| |
| | Signed-off-by: Michael Adam <obnox@samba.org>
| | Reviewed-By: Amitay Isaacs <amitay@gmail.com>
| | (This used to be ctdb commit e397702e271af38204fd99733bbeba7c1db3a999)
| * vacuum: reorder some of ctdb_process_delete_list() more intuitively  Michael Adam  2013-04-24  1  -21/+21
| | Now that the nodemap and its talloc children don't hang off of the
| | delete_records_list talloc context, we can build the nodemap earlier,
| | and move the construction of the delete_records_list to where it is
| | more obvious what it is used for.
| |
| | Signed-off-by: Michael Adam <obnox@samba.org>
| | Reviewed-By: Amitay Isaacs <amitay@gmail.com>
| | (This used to be ctdb commit e3740899c1af6962f93c85ad7d1cb71bddce45c6)
| * vacuum: add explicit temporary memory context to ctdb_process_delete_list()  Michael Adam  2013-04-24  1  -5/+12
| | This removes the implicit artificial talloc hierarchy and makes the
| | code easier to understand.
| |
| | Signed-off-by: Michael Adam <obnox@samba.org>
| | Reviewed-By: Amitay Isaacs <amitay@gmail.com>
| | (This used to be ctdb commit b7c3b8cdf92c597e621e3dae28b110d321de5ea8)
| * vacuum: fix indentation in ctdb_process_delete_list()  Michael Adam  2013-04-24  1  -2/+2
| | Signed-off-by: Michael Adam <obnox@samba.org>
| | Reviewed-By: Amitay Isaacs <amitay@gmail.com>
| | (This used to be ctdb commit 59a887e12469266e514ad7d4e34810e7ea888ba3)
| * vacuum: free temporary allocated memory correctly in ctdb_process_delete_list().  Michael Adam  2013-04-24  1  -8/+15
| | Add a common exit point for cleanup.
| |
| | Signed-off-by: Michael Adam <obnox@samba.org>
| | Reviewed-By: Amitay Isaacs <amitay@gmail.com>
| | (This used to be ctdb commit 11d728465a9c635e1829abaae17e2f7720433b69)
| * vacuum: move variable into scope of use in ctdb_process_delete_list()  Michael Adam  2013-04-24  1  -1/+2
| | Signed-off-by: Michael Adam <obnox@samba.org>
| | Reviewed-By: Amitay Isaacs <amitay@gmail.com>
| | (This used to be ctdb commit 3710dd0f313f551f1b302b4961e0203243e3d661)
| * vacuum: move variable into scope of use in ctdb_process_delete_list()  Michael Adam  2013-04-24  1  -1/+1
| | Signed-off-by: Michael Adam <obnox@samba.org>
| | Reviewed-By: Amitay Isaacs <amitay@gmail.com>
| | (This used to be ctdb commit 4640979b526b6dac69a6a0555bfce75fe0206dac)
| * vacuum: simplify ctdb_process_delete_list(): reduce indentation  Michael Adam  2013-04-24  1  -113/+114
| | Signed-off-by: Michael Adam <obnox@samba.org>
| | Reviewed-By: Amitay Isaacs <amitay@gmail.com>
| | (This used to be ctdb commit f3e6e7f8ef22bd70dd2f101d818e2e5ab5ed3cd8)
| * vacuum: add DEBUG to skip conditions in delete_record_traverse()  Michael Adam  2013-04-24  1  -5/+25
| | Signed-off-by: Michael Adam <obnox@samba.org>
| | Reviewed-By: Amitay Isaacs <amitay@gmail.com>
| | (This used to be ctdb commit 817c77a3d0a3546bf46389cec5f6b54778dd1693)
| * vacuum: break line for RO-flags check in delete_record_traverse() for readability  Michael Adam  2013-04-24  1  -1/+5
| | Signed-off-by: Michael Adam <obnox@samba.org>
| | Reviewed-By: Amitay Isaacs <amitay@gmail.com>
| | (This used to be ctdb commit 3f7e35ff0db740cdcb6d27c43a59bb6ca6066efb)
| * client: fix ctdb_control() to be able to cope with CTDB_CTRL_FLAG_NOREPLY  Michael Adam  2013-04-24  1  -0/+11
| | This was apparently not used before in this context, and hence the
| | bug went undetected.
| |
| | It becomes necessary when ctdb_local_schedule_for_deletion() is called
| | from a client ctdbd (the vacuuming child), which then needs to send
| | the SCHEDULE_FOR_DELETION control to its parent.
| |
| | Pair-Programmed-With: Stefan Metzmacher <metze@samba.org>
| | Signed-off-by: Stefan Metzmacher <metze@samba.org>
| | Signed-off-by: Michael Adam <obnox@samba.org>
| | Reviewed-By: Amitay Isaacs <amitay@gmail.com>
| | (This used to be ctdb commit e72a5e11845fe445baaee4730bb0bea8588ee9e3)
| * ctdbd: Set num_clients statistic from ctdb->num_clients  Amitay Isaacs  2013-04-22  2  -3/+4
| | This fixes the problem of "ctdb statisticsreset" clearing the number
| | of clients even when there are active clients.
| |
| | Values returned in statistics for frozen, recovering, memory_used are
| | based on the current state of CTDB and are not maintained as
| | statistics. This should include num_clients as well.
| |
| | Currently ctdb->num_clients is unused. So use that to track the number
| | of clients and fill in statistics field only when requested.
| |
| | Signed-off-by: Amitay Isaacs <amitay@gmail.com>
| | (This used to be ctdb commit dc4ca816630ed44b419108da53421331243fb8c7)
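The distinction this commit message draws, between accumulated counters and values derived from current state, can be shown with a minimal sketch. The `Daemon` class and the `total_calls` counter are illustrative assumptions, not the ctdb data structures.

```python
class Daemon:
    """Toy daemon separating live state from resettable counters."""

    def __init__(self):
        self.num_clients = 0                  # live state, never reset
        self.statistics = {"total_calls": 0}  # accumulated counters

    def client_connected(self):
        self.num_clients += 1

    def client_disconnected(self):
        self.num_clients -= 1

    def statistics_reset(self):
        # Clears only the accumulated counters.
        self.statistics = {name: 0 for name in self.statistics}

    def get_statistics(self):
        # num_clients is filled in from current state at request time.
        snapshot = dict(self.statistics)
        snapshot["num_clients"] = self.num_clients
        return snapshot
```

Because `num_clients` is copied into the snapshot only when statistics are requested, a reset in this model cannot make connected clients disappear from the report.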
| * ctdbd: Log PID file creation and removal at NOTICE level  Martin Schwenke  2013-04-22  1  -3/+3
| | Unexpected removal of this file can have serious consequences, so it
| | is best if this is logged at the default level.
| |
| | Signed-off-by: Martin Schwenke <martin@meltin.net>
| | (This used to be ctdb commit bfed6a8d1771db3401d12b819204736c33acb312)
| * scripts: Ensure even external scripts get tagged in logs as "ctdbd"  Martin Schwenke  2013-04-22  3  -5/+5
| | Our practice is to search logs for "ctdbd:". We want to make sure we
| | find everything.
| |
| | Signed-off-by: Martin Schwenke <martin@meltin.net>
| | (This used to be ctdb commit 5940a2494e9e43a83f2bca098bd04dfc1a8f2e93)
| * eventscripts: Ensure directories are created  Martin Schwenke  2013-04-22  1  -0/+6
| | Previous commits stopped the top level of the script from creating
| | certain directories but some functions assume that required
| | directories exist. Create those directories instead.
| |
| | Signed-off-by: Martin Schwenke <martin@meltin.net>
| | (This used to be ctdb commit 0076cfc4666e5a96eb2c8affb59585b090840e00)
| * scripts: Clean up update_tickles() and handling of associated directory  Martin Schwenke  2013-04-19  1  -5/+2
| | Signed-off-by: Martin Schwenke <martin@meltin.net>
| | (This used to be ctdb commit 700cf95a1f29b4b88460a00a55d57a9e397011e0)
| * scripts: Use $CTDB_SCRIPT_DEBUGLEVEL instead of something more complex  Martin Schwenke  2013-04-19  6  -64/+6
| | The current logic is horrible and creates an unnecessary file. Let's
| | make the script debug level independent of ctdb's debug level.
| |
| | * Have debug() use $CTDB_SCRIPT_DEBUGLEVEL directly
| | * Remove ctdb_set_current_debuglevel()
| | * Remove the "getdebug" command from ctdb stub in eventscript unit
| |   tests
| | * Update relevant eventscript unit tests to use
| |   $CTDB_SCRIPT_DEBUGLEVEL
| |
| | Signed-off-by: Martin Schwenke <martin@meltin.net>
| | (This used to be ctdb commit 85efa446c7f5c5af1c3a960001aa777775ae562f)
| * scripts: Ensure service command is in $PATH in ctdb-crash-cleanup.sh  Martin Schwenke  2013-04-19  1  -5/+5
| | Move the use of the service command below inclusion of functions file,
| | which sets $PATH.
| |
| | Signed-off-by: Martin Schwenke <martin@meltin.net>
| | (This used to be ctdb commit d254d03f69cbdc3e473202b759af6e1392cbb59c)
| * initscript: Remove duplicate setting of $ctdbd  Martin Schwenke  2013-04-18  1  -2/+0
| | Signed-off-by: Martin Schwenke <martin@meltin.net>
| | Reviewed-by: Michael Adam <obnox@samba.org>
| | (This used to be ctdb commit e7a4b7e35a1e4b826846e2494a3803abb57065ee)
| * util: Removed unused declaration of ctdbd_start()  Martin Schwenke  2013-04-18  1  -1/+0
| | Signed-off-by: Martin Schwenke <martin@meltin.net>
| | Reviewed-by: Michael Adam <obnox@samba.org>
| | (This used to be ctdb commit 1e989894764e4cd1d551c44784d91cb295cd790d)
| * include: Move ctdb_start_daemon() from ctdb_client.h to ctdb_private.h  Martin Schwenke  2013-04-18  2  -1/+3
| | It really is internal.
| |
| | Signed-off-by: Martin Schwenke <martin@meltin.net>
| | Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>
| | Reviewed-by: Michael Adam <obnox@samba.org>
| | (This used to be ctdb commit abb64f62efaa70df4b87c030b96300eafd98e6a3)
| * scripts: ctdb-crash-cleanup.sh uses initscript to see if ctdbd is running  Martin Schwenke  2013-04-18  1  -1/+1
| | "ctdb ping" can time out. How many times should we try? Instead,
| | depend on the initscript to implement something sane.
| |
| | Signed-off-by: Martin Schwenke <martin@meltin.net>
| | Reviewed-by: Michael Adam <obnox@samba.org>
| | (This used to be ctdb commit 90cb337e5ccf397b69a64298559a428ff508f196)
| * initscript: Use a PID file to implement the "status" option  Martin Schwenke  2013-04-18  1  -30/+56
| | Using "ctdb ping" and "ctdb status" is fraught with danger. These
| | commands can time out when ctdbd is running, leading callers to
| | believe that ctdbd is not running. Timeouts could be increased but we
| | would still have to handle potential timeouts.
| |
| | Everything else in the world implements the "status" option by
| | checking if the relevant process is running. This change makes CTDB do
| | the same thing and uses standard distro functions.
| |
| | This change is backward compatible in the sense that a missing
| | /var/run/ctdb/ directory means that we don't do a PID file check but
| | just depend on the distro's checking method. Therefore, if CTDB was
| | started with an older version of this script then "service ctdb
| | status" will still work.
| |
| | This script does not support changing the value of CTDB_VALGRIND
| | between calls. If you start with CTDB_VALGRIND=yes then you need to
| | check status with the same setting. CTDB_VALGRIND is a debug variable,
| | so this is acceptable.
| |
| | This also adds sourcing of /lib/lsb/init-functions to make the Debian
| | function status_of_proc() available.
| |
| | Signed-off-by: Martin Schwenke <martin@meltin.net>
| | Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>
| | Reviewed-by: Michael Adam <obnox@samba.org>
| | (This used to be ctdb commit 687e2eace4f48400cf5029914f62b6ddabb85378)
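The general technique of a PID-file based status check can be sketched as follows. This is not the initscript (which is shell and relies on distro helper functions); it is a minimal Python illustration assuming a POSIX system, and the function name `status_from_pidfile` and its return strings are made up for the example. The fallback to the distro's own check when the directory is missing is not modelled.

```python
import os


def status_from_pidfile(pidfile):
    """Report whether the process named in a PID file exists (POSIX only)."""
    try:
        with open(pidfile) as f:
            pid = int(f.read().strip())
    except (OSError, ValueError):
        # Missing, unreadable, or malformed PID file.
        return "not running"
    if pid <= 0:
        return "not running"
    try:
        os.kill(pid, 0)  # signal 0 checks for existence, sends nothing
    except ProcessLookupError:
        return "not running (stale PID file)"
    except PermissionError:
        return "running"  # process exists but belongs to another user
    return "running"
```

Unlike a request sent to the daemon, this check cannot time out: it only asks the kernel whether the process exists.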
| * ctdbd: Add --pidfile option  Martin Schwenke  2013-04-18  3  -1/+37
| | Default is not to create a pid file.
| |
| | Signed-off-by: Martin Schwenke <martin@meltin.net>
| | Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>
| | Reviewed-by: Michael Adam <obnox@samba.org>
| | (This used to be ctdb commit 996e74d3db0c50f91b320af8ab7c43ea6b1136af)
| * util: ctdb_fork() should call ctdb_set_child_info()  Martin Schwenke  2013-04-18  1  -0/+2
| | For now we pass NULL as the child name. Later we'll give ctdb_fork()
| | and friends an extra argument and pass that through.
| |
| | Signed-off-by: Martin Schwenke <martin@meltin.net>
| | Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>
| | Reviewed-by: Michael Adam <obnox@samba.org>
| | (This used to be ctdb commit ba8866d40125bab06391a17d48ff06a4a9f9da89)
| * util: New functions ctdb_set_child_info() and ctdb_is_child_process()  Martin Schwenke  2013-04-18  2  -0/+24
| | Must be called by all child processes.
| |
| | Signed-off-by: Martin Schwenke <martin@meltin.net>
| | Reviewed-by: Michael Adam <obnox@samba.org>
| | (This used to be ctdb commit 59b019a97aad9a731f9080ea5be14d0dbdfe03d6)
| * tests: add a comment to recovery db corruption test  Michael Adam  2013-04-17  1  -0/+7
| | The comment explains that we use "ctdb stop" and "ctdb continue" but
| | we should use "ctdb setrecmasterrole off".
| |
| | Signed-off-by: Michael Adam <obnox@samba.org>
| | (This used to be ctdb commit 06ac62f890299021220214327f1b611c3cf00145)
| * tests: Add a test for subsequent recoveries corrupting databases  Amitay Isaacs  2013-04-17  1  -0/+126
| | Signed-off-by: Amitay Isaacs <amitay@gmail.com>
| | Reviewed-by: Michael Adam <obnox@samba.org>
| | (This used to be ctdb commit b1577a11d548479ff1a05702d106af9465921ad4)
| * tests: Support waiting for "recovered" state in tests  Amitay Isaacs  2013-04-17  1  -1/+4
| | Signed-off-by: Amitay Isaacs <amitay@gmail.com>
| | Reviewed-by: Michael Adam <obnox@samba.org>
| | (This used to be ctdb commit 2438f3a4944f7adbcae4cc1b9d5452714244afe7)
| * ctdb_call: don't bump the rsn in ctdb_become_dmaster() any more  Michael Adam  2013-04-17  1  -1/+1
| | This is now done in ctdb_ltdb_store_server(), so this extra bump can
| | be spared.
| |
| | Signed-off-by: Michael Adam <obnox@samba.org>
| | Reviewed-By: Amitay Isaacs <amitay@gmail.com>
| | (This used to be ctdb commit cad3107b12e8392f786f9a758ee38cf3a3d58538)
| * Fix a severe recovery bug that can lead to data corruption for SMB clients.  Michael Adam  2013-04-17  1  -3/+6
| | Problem:
| |
| | Recovery can under certain circumstances lead to old record copies
| | resurrecting: Recovery selects the newest record copy purely by RSN.
| | At the end of the recovery, the recovery master is the dmaster for all
| | records in all (non-persistent) databases, and the other nodes locally
| | hold the complete copy of the databases. The bug is that the recovery
| | process does not increment the RSN on the recovery master at the end
| | of the recovery. Now clients acting directly on the recovery master
| | will directly change a record's content on the recmaster without
| | migration and hence without RSN bump. So a subsequent recovery cannot
| | tell that the recmaster's copy is newer than the copies on the other
| | nodes, since their RSN is the same. Hence, if the recmaster is not
| | node 0 (or more precisely not the active node with the lowest node
| | number), the recovery will choose copies from nodes with lower number
| | and stick to these.
| |
| | Here is how to reproduce:
| |
| | - assume we have a cluster with at least 2 nodes
| | - ensure that the recmaster is not node 0
| |   (maybe ensure with "onnode 0 ctdb setrecmasterrole off");
| |   say recmaster is node 1
| | - choose a new database name, say "test1.tdb"
| |   (make sure it is not yet attached as persistent)
| | - choose a key name, say "key1"
| | - all cluster nodes should be OK and no recovery running
| | - now do the following on node 1:
| |
| |   1. dbwrap_tool test1.tdb store key1 uint32 1
| |   2. dbwrap_tool test1.tdb fetch key1 uint32
| |      ==> 1
| |   3. ctdb recover
| |   4. dbwrap_tool test1.tdb store key1 uint32 2
| |   5. dbwrap_tool test1.tdb fetch key1 uint32
| |      ==> 2
| |   6. ctdb recover
| |   7. dbwrap_tool test1.tdb fetch key1 uint32
| |      ==> 1
| |      ==> BUG
| |
| | This is a very severe bug, since when applied to Samba's locking.tdb
| | database, it means that for SMB clients on clustered Samba there is
| | the potential for locking out oneself from previously opened files or,
| | even worse, data corruption:
| |
| | Case 1: locking out
| |
| | - client on recmaster opens file
| | - recovery propagates open file handle (entry in locking.tdb) to
| |   other nodes
| | - client closes file
| | - client opens the same file
| | - recovery resurrects old copy of open file record in locking.tdb
| |   from lower node
| | - client closes file but fails to delete entry in locking.tdb
| | - client tries to open same file again but fails, since the old
| |   record locks it out (since the client is still connected)
| |
| | Case 2: data corruption
| |
| | - client1 on recmaster opens file
| | - recovery propagates open file info to other nodes
| | - client1 closes the file and disconnects
| | - client2 opens the same file
| | - recovery resurrects old copy of locking.tdb record, where client2
| |   has no entry, but client1 has.
| | - but client2 believes it still has a handle
| | - client3 opens the file and succeeds without conflicting with
| |   client2 (the detached entry for client1 is discarded because the
| |   server does not exist any more).
| |   => both client2 and client3 believe they have exclusive access to
| |   the file and writing creates data corruption
| |
| | Fix:
| |
| | When storing a record on the dmaster, bump its RSN.
| |
| | The ctdb_ltdb_store_server() is the central function for storing a
| | record to a local tdb from the ctdbd server context. So this is also
| | the place where the RSN of the record to be stored should be
| | incremented, when storing on the dmaster.
| |
| | For the case of the record migration, this is currently done in
| | ctdb_become_dmaster() in ctdb_call.c, but there are other places, such
| | as in recovery, where we should bump the RSN, but currently don't do
| | it.
| |
| | So moving the RSN incrementation into ctdb_ltdb_store_server fixes the
| | recovery-record-resurrection bug.
| |
| | Signed-off-by: Michael Adam <obnox@samba.org>
| | Reviewed-By: Amitay Isaacs <amitay@gmail.com>
| | (This used to be ctdb commit feb1d40b21a160737aead22e398f3c34ff3be8de)
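The reproduction steps and the fix in this commit message can be replayed with a small toy model. This is a sketch under stated assumptions, not ctdb code: databases are plain dicts mapping a key to an `(rsn, value)` pair, and the function names `server_store`, `recover`, `client_write` and `reproduce` are hypothetical. The model only captures the selection rule (highest RSN wins, ties go to the lowest node number) and where the RSN gets bumped.

```python
def server_store(db, key, rsn, value, is_dmaster, bump_on_dmaster):
    """Server-side store; with the fix enabled, the dmaster bumps the RSN."""
    if is_dmaster and bump_on_dmaster:
        rsn += 1
    db[key] = (rsn, value)


def recover(nodes, recmaster, bump_on_dmaster):
    """Pick the copy with the highest RSN per key and push it to all nodes.

    Nodes are scanned in ascending node number and only a strictly
    higher RSN replaces the current choice, so ties go to the lowest node.
    """
    keys = set().union(*(db.keys() for db in nodes.values()))
    for key in keys:
        best = None
        for num in sorted(nodes):
            rec = nodes[num].get(key)
            if rec is not None and (best is None or rec[0] > best[0]):
                best = rec
        for num, db in nodes.items():
            # After recovery the recmaster is dmaster for every record.
            server_store(db, key, best[0], best[1],
                         num == recmaster, bump_on_dmaster)


def client_write(db, key, value):
    """A client on the dmaster changes the content without an RSN bump."""
    rsn = db.get(key, (0, None))[0]
    db[key] = (rsn, value)


def reproduce(bump_on_dmaster):
    nodes = {0: {}, 1: {}}          # node 1 plays the recmaster
    client_write(nodes[1], "key1", 1)
    recover(nodes, 1, bump_on_dmaster)
    client_write(nodes[1], "key1", 2)
    recover(nodes, 1, bump_on_dmaster)
    return nodes[1]["key1"][1]      # value fetched on node 1 at the end
```

In this model `reproduce(False)` ends with the stale value 1, matching step 7 above, while `reproduce(True)` keeps the value 2 because the recmaster's copy carries a higher RSN than node 0's after the first recovery.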
| * logging: fix comment typo  Michael Adam  2013-04-17  1  -1/+1
| | Signed-off-by: Michael Adam <obnox@samba.org>
| | Reviewed-by: Amitay Isaacs <amitay@gmail.com>
| | (This used to be ctdb commit 4c0cbfbe8b19f2e6fe17093b52c734bec63dd8b7)