Commit message | Author | Age | Files | Lines
...
| * eventscripts: New configuration variable $CTDB_SKIP_GANESHA_NFSD_CHECKMartin Schwenke2013-07-051-1/+3
| | | | | | | | | | | | | | | | | | This allows 60.ganesha to be unit tested, except for the core Ganesha monitoring code. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit f606df4f2db754592e6d1a16c26e155cacb2beef)
| * eventscript: Move Ganesha nfsd monitoring to a functionMartin Schwenke2013-07-051-51/+59
| | | | | | | | | | | | Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit ceb5b2d37f7ab4894908ec26f3812b3bed991525)
| * eventscripts: Drop RPC service version from nfs_check_rpc_service() callsMartin Schwenke2013-07-051-2/+2
| | | | | | | | | | | | | | | | | | | | | | Support for this was removed in commit 77302dbfd85754e02559eccb2dd6c090db0b6b9f and I overlooked its use in 60.ganesha. Signed-off-by: Martin Schwenke <martin@meltin.net> Pair-programmed-with: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 520914e7ee1b879c1080e5857fda18ed5b973fd6)
| * ctdbd: Log something when releasing all IPsMartin Schwenke2013-07-051-0/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | At the moment this is silent and it can be confusing to see IPs just disappear. Also, this message: Been in recovery mode for too long. Dropping all IPS can cause anxiety when all IPs should already have been dropped. Adding a comforting message saying that 0 IPs were dropped relieves such anxiety. :-) Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 4d0f26b306fc465d551d340b0e7dce4412eae3fd)
| * recoverd: Minor style improvements for ctdb_reload_remote_public_ips()Martin Schwenke2013-07-051-20/+25
| | | | | | | | | | | | | | | | | | | | | | | | | | * Add a variable to the loop to make the code more readable and have it generally fit into 80 columns. * Improve comments. * Improve log messages. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 0a292fa8939a1343e44cadaa8ed9f3c0f18ca82f)
| * recoverd: Clean up log messages in remote IP verificationMartin Schwenke2013-07-053-11/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The log messages in verify_remote_ip_allocation() are confusing because they don't include the PNN of the problem node, because it is not known in this function. Add the PNN of the node being verified as a function argument and then shuffle the log messages around to make them clearer. Also fold 3 nested if statements into just one. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit f0942fa01cd422133fc9398f56b4855397d7bc86)
| * recoverd: Fix an unclear log message - "Restart recovery process"Martin Schwenke2013-07-052-2/+2
| | | | | | | | | | | | | | | | | | | | | | When the recovery master notices a node in recovery mode it starts the recovery process; it doesn't restart it. Update documentation to match. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 298c4d2c3b4ea3d900c91f5a0a5aca2952a13d61)
| * recoverd: Fix an incorrect commentMartin Schwenke2013-07-051-3/+1
| | | | | | | | | | | | Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 9f6cd8b0bea619991c9f3bf35188c5950dabf8f4)
| * ctdbd: Use ctdb_die() on "setup" event failureMartin Schwenke2013-07-051-2/+1
| | | | | | | | | | | | | | | | This is slightly easier to read because it all fits on 1 line. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 035bf3eecf99337c84d4ad16cdbf297b1fa037db)
| * ctdbd: Avoid a core dump when "init" event failsMartin Schwenke2013-07-051-1/+1
| | | | | | | | | | | | | | | | | | | | The "init" event only really fails in the scripts, which should log something useful on failure. Therefore, a core dump isn't terribly useful and sometimes attracts unwanted attention. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 3af2d833b63af9931792106db71797f3692669a8)
| * util: New function ctdb_die()Martin Schwenke2013-07-052-0/+10
| | | | | | | | | | | | | | | | | | This is like ctdb_fatal() but exits cleanly without dumping core or generating a backtrace. Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit c0a9456692c88a7a5542cd893d8f326524d3f94e)
| * eventscripts: When replaying monitor status, don't log empty outputMartin Schwenke2013-07-051-1/+3
| | | | | | | | | | | | Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit ce04f1c107b4392ca955d9f29b93aaaae62439ce)
| * ctdbd: Release IP callback should fail if the IP is still hostedMartin Schwenke2013-07-051-0/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | At the moment there are (at least) 2 bugs that cause rogue IPs: * A race where release_ip_callback() runs after a "subsequent" take IP has completed. The IP is back on an interface but we unset vnn->iface in the callback. * A "releaseip" eventscript times out. We ignore the timeout and call it success, deleting the VNN even if the IP is still hosted. We could decide not to ignore the timeout and ban the node, but killing TCP connections can take a long time and that might result in a lot of banning. We probably won't reinstate banning on "releaseip" until killing TCP connections has been optimised. In both cases, a rogue IP can be avoided by leaving vnn->iface set and simply failing the control. Signed-off-by: Martin Schwenke <martin@meltin.net> Pair-programmed-with: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit c5797f2942e83da24df548ea07196fbbac0eab20)
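The idea of failing the control while keeping the interface recorded can be sketched as follows. This is a much-reduced illustration under stated assumptions: `struct sketch_vnn` and `sketch_release_ip_done()` are hypothetical, and the `ip_still_hosted` flag stands in for whatever check the daemon performs against the system's interfaces.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, much-reduced stand-in for a public address record. */
struct sketch_vnn {
	const char *iface; /* interface believed to host the IP, or NULL */
};

/* Completion step for a "release IP" request.
 * Returns 0 on success, -1 if the control should fail. */
int sketch_release_ip_done(struct sketch_vnn *vnn, bool ip_still_hosted)
{
	if (ip_still_hosted) {
		/* Keep vnn->iface set so the record still matches
		 * what is actually configured on the node. */
		return -1;
	}

	/* The IP is really gone, so it is safe to forget the interface. */
	vnn->iface = NULL;
	return 0;
}
```

Both failure modes listed above end up in the same branch: the record is left untouched and the caller sees an error instead of a false success.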
| * ctdbd: Log warnings in release IP when unexpected interface is encounteredMartin Schwenke2013-07-051-0/+15
| | | | | | | | | | | | | | | | | | | | Previous code changes work around potential problems but do not provide useful information when a problem occurs. Signed-off-by: Martin Schwenke <martin@meltin.net> Pair-programmed-with: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit f1f1b0c24b9b6cd24b83a4e4da16e179287ec6ac)
| * ping_pong: Validate num_locks argument > 0Amitay Isaacs2013-07-041-0/+4
| | | | | | | | | | | | | | | | This fixes the floating point error if num_locks = 0. Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 16afe36de52561a62372c14b567683dc898369d5)
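On Linux, an integer division or modulo by zero raises SIGFPE, which the shell reports as a "floating point exception". A minimal sketch of the kind of argument check described above might look like this; the function names are hypothetical, and the assumption that the lock index wraps with a modulo is mine, not taken from the ping_pong source.

```c
#include <limits.h>
#include <stdlib.h>

/* Parse a lock count from a command-line argument.
 * Returns 0 and stores the value on success, -1 if the argument is
 * not a whole positive number that fits in an int. */
int sketch_parse_num_locks(const char *arg, int *num_locks)
{
	char *end = NULL;
	long n = strtol(arg, &end, 10);

	if (end == arg || *end != '\0') {
		return -1; /* empty or trailing garbage */
	}
	if (n <= 0 || n > INT_MAX) {
		return -1; /* zero would later be used as a divisor */
	}
	*num_locks = (int)n;
	return 0;
}

/* Index of the next lock, wrapping around at num_locks.
 * Only safe to call once num_locks has been validated as > 0. */
int sketch_next_lock(int i, int num_locks)
{
	return (i + 1) % num_locks;
}
```

Rejecting the value at parse time keeps the check in one place, so the wrap-around arithmetic never sees a zero divisor.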
| * tests: If connection to ctdb daemon fails, exitAmitay Isaacs2013-07-046-0/+18
| | | | | | | | | | | | | | | | | | This fixes the segmentation fault that occurs if any of the test code fails to connect to the CTDB daemon. Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit d48eecd748830598f4f080952f2bf05d6f92738c)
| * build: Fix compiler warnings for uninitialized variablesAmitay Isaacs2013-07-043-2/+3
| | | | | | | | | | | | Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 5408c5c4050539e5aa06a5e82ceb63a6cb5cef0c)
| * recoverd: Send the result from child process only onceAmitay Isaacs2013-07-041-1/+0
| | | | | | | | | | | | | | | | | | The result has already been sent before the child starts waiting for the parent ctdbd process. Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 9aa13bcedd83d463c871e3cf1f3a65da3cd83992)
| * packaging: Enable compiler optimizationsAmitay Isaacs2013-07-041-1/+1
| | | | | | | | | | | | | | | | | | | | This reverts d09570c70551aa40390ce9ceffe7bc234e1afafe. ... hoping the segv has been found in the last 6 years. :-) Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 9b529189f8456fad7868fc154ae27a6fd87e93b3)
| * packaging: Allow building RPMs with system tdb/talloc/teventAmitay Isaacs2013-07-041-6/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | To build CTDB RPMs with system installed libraries, use the following command: ./packaging/RPM/makerpms.sh --with system_talloc --with system_tdb --with system_tevent Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit bb54f3924ff19cd089b0a166fe8368db162ad709)
| * packaging: Do not mark /etc/ctdb/functions as configuration fileAmitay Isaacs2013-07-041-1/+1
| | | | | | | | | | | | Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 1b0faae9c939a2f8da3cacba715ca62a5830d190)
| * packaging: Install README.notify.d using %doc directiveAmitay Isaacs2013-07-042-3/+2
| | | | | | | | | | | | | | Signed-off-by: Amitay Isaacs <amitay@gmail.com> Pair-Programmed-With: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 53d34eb2f9e5434dea4e7182b6af566a3a96a368)
| * packaging: Install docs using %doc directiveAmitay Isaacs2013-07-042-19/+6
| | | | | | | | | | | | | | Signed-off-by: Amitay Isaacs <amitay@gmail.com> Pair-Programmed-With: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 6fe584d05543eebd24abd19bab502dc4da04e921)
| * packaging: Remove ctdb_transaction from docdirAmitay Isaacs2013-07-041-4/+0
| | | | | | | | | | | | | | | | It's bundled in ctdb-tests package. Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 7e53fbf92b6dd5211d918ea0e23126b7dfa50c42)
| * doc: Add a disclaimer for the EnableBans tunableMartin Schwenke2013-07-041-1/+2
| | | | | | | | | | | | Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit 145b1966c1b34f1667a175235e1df2741294391c)
| * doc: Add banning bug fixes to NEWSMartin Schwenke2013-07-041-1/+8
| | | | | | | | | | | | Signed-off-by: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit b4c06e8ec8b227c1e6c01444038c3b15b5f9e606)
| * ctdbd: Don't ban self if init or shutdown event failsAmitay Isaacs2013-07-021-1/+5
| | | | | | | | | | | | | | | | | | There is no point in banning the node if init or shutdown event times out since it's going to quit anyway. Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit ef1c4e99ca66e7a990bc557f34abb624c315e6ba)
| * doc: The second half of monitoring is only for recovery masterAmitay Isaacs2013-07-021-2/+2
| | | | | | | | | | | | Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit fcd5e1f04c5fe6c98399429b8f0918b8779acba6)
| * recoverd: when the recmaster is banned, use that information when forcing an electionMichael Adam2013-07-021-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When we trigger an election because the recmaster considers itself inactive, update our local nodemap with the recmaster's flags before calling force_election(). This way, we don't send the inactive node commands (e.g. freeze) that may fail and then lead to ourselves getting banned. The theory is that this should help avoid banning loops. Signed-off-by: Michael Adam <obnox@samba.org> (This used to be ctdb commit 932360992b08a5483d90c0590218ba0fd756119e)
| * recoverd: fix a comment typoMichael Adam2013-07-021-1/+1
| | | | | | | | | | | | Signed-off-by: Michael Adam <obnox@samba.org> (This used to be ctdb commit 741944f118e98f178b860194eecb215180949d18)
| * recoverd: fix a comment in main_loopMichael Adam2013-07-021-3/+3
| | | | | | | | | | | | Signed-off-by: Michael Adam <obnox@samba.org> (This used to be ctdb commit ac06c46e4a80c635f6094b5ac6f0bf3e3a02db95)
| * recoverd: eliminate some trailing spaces from ctdb_election_win()Michael Adam2013-07-021-2/+2
| | | | | | | | | | | | Signed-off-by: Michael Adam <obnox@samba.org> (This used to be ctdb commit df30c0a05ed908fc2a997c56ff5484736b23b70f)
| * recoverd: Don't continue if the current node gets bannedMartin Schwenke2013-07-021-4/+19
| | | | | | | | | | | | | | | | | | A banned node cannot continue with recovery or with monitoring the cluster. Signed-off-by: Martin Schwenke <martin@meltin.net> Pair-programmed-with: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 14399de1dd0bd8dabf1f48b1457e3ccb37589d8a)
| * recoverd: Refactor code to ban misbehaving nodesAmitay Isaacs2013-07-021-37/+26
| | | | | | | | | | | | | | | | | | | | Since we have nodemap information, there is no need to hardcode the limit of 20. Signed-off-by: Amitay Isaacs <amitay@gmail.com> Pair-Programmed-With: Martin Schwenke <martin@meltin.net> (This used to be ctdb commit aea12dce83ef385e9fb3bc03ac7ace0874a0e3fe)
| * recoverd: Move code to ban other nodes after we get local node flagsAmitay Isaacs2013-07-021-22/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If a node gets banned first, then it should not ban other nodes. This code was moved up in main_loop to avoid waiting for nodemap from other nodes (commit 83b0261f2cb453195b86f547d360400103a8b795). To prevent a banned node from banning other nodes, we need to first get nodemap information from local node, so trying to ban other nodes can fail if we are already banned. Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit ae1693905036ecdbc4594fde1f12500faae4a554)
| * recoverd: Delay the initial election if node is started in stopped stateAmitay Isaacs2013-07-021-22/+26
| | | | | | | | | | | | | | | | | | Since there is an early exit if a node is stopped or banned, we can wait till the node becomes active to start initial election. Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 593a17678fbd3109e118154b034d43b852659518)
| * recoverd: Update capabilities only if the current node is activeAmitay Isaacs2013-07-021-7/+7
| | | | | | | | | | | | | | | | | | | | Since we do an early return if a node is stopped or banned, move update capabilities code below the early return and just before we check the capabilities of current recovery master. Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 93bcb6617e1024f810533e12390a572f51703ca0)
| * recoverd: No need to check if node is recovery master when inactiveAmitay Isaacs2013-07-021-9/+0
| | | | | | | | | | | | | | | | | | | | If a node is stopped or banned, it will cause an early return from main_loop, so this check is redundant. The election will be called by an active node. Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 815ddd3341b7e9db39e05a3a3fcd9a1420f053bc)
| * recoverd: Always do an early exit from main_loop if node is stopped or bannedAmitay Isaacs2013-07-021-11/+8
| | | | | | | | | | | | | | | | | | A stopped or banned node cannot do anything useful. So do not participate in any cluster activity and do not cause any unnecessary network traffic. Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 2396981c4bcf30530aeb7f4395093cc202105b50)
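The early-exit pattern described above can be sketched as a guard at the top of one monitoring pass. Everything here is illustrative: the flag names and values are assumptions made for this sketch, and `sketch_main_loop_pass()` is a hypothetical stand-in that merely counts the remote requests it would have issued.

```c
#include <stdbool.h>
#include <stdint.h>

/* Flag names and values are assumptions made for this sketch. */
#define SKETCH_FLAG_BANNED  0x08u
#define SKETCH_FLAG_STOPPED 0x20u

/* A node is inactive if it is banned or stopped. */
bool sketch_node_is_inactive(uint32_t flags)
{
	return (flags & (SKETCH_FLAG_BANNED | SKETCH_FLAG_STOPPED)) != 0;
}

/* One pass of a monitoring loop.
 * Returns true if the pass went on to do cluster-wide work. */
bool sketch_main_loop_pass(uint32_t local_flags, unsigned *remote_requests)
{
	if (sketch_node_is_inactive(local_flags)) {
		/* Nothing useful to do: skip all remote traffic. */
		return false;
	}

	/* Placeholder for fetching remote nodemaps, checking the
	 * recovery master, and so on. */
	(*remote_requests)++;
	return true;
}
```

Placing the guard first means every later step in the pass can assume the local node is active, which is what lets the related checks further down be removed.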
| * recoverd: Do not set banning credits on a node if current node is inactiveAmitay Isaacs2013-07-021-0/+6
| | | | | | | | | | | | | | | | | | | | If the current node is banned or stopped, then it should not assign banning credits to other nodes since the current node will not have up-to-date flags of other nodes. Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 38304f88e0c634e97d4687c25adef975f71537b8)
| * banning: Do not come out of ban if databases are not frozenAmitay Isaacs2013-07-021-0/+15
| | | | | | | | | | | | Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit a60f228f8380f222f838eb619d2ab55f96f11ac2)
| * banning: No need to check if banned pnn is for local nodeAmitay Isaacs2013-07-021-3/+1
| | | | | | | | | | | | | | | | | | If the banned pnn is not the local node, the function returns early. So no need for additional check. Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 297d93cecc3c0655e72ecac38508e113bdbeab9c)
| * banning: Make ctdb_local_node_got_banned() a void functionAmitay Isaacs2013-07-023-6/+4
| | | | | | | | | | | | | | | | When this function is called, we are already committed to banning and there is no point in failing this function. If freezing of databases fails, it will be fixed by the recovery daemon. (This used to be ctdb commit bb178338658b4ae32382a1f62f7c21cee1d4878f)
| * recoverd: Also check if current node is in recovery when it is bannedAmitay Isaacs2013-07-021-6/+6
| | | | | | | | | | | | Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 6a9dbb8fb0f1f6e8c206189cdc2d33bb371ea2a8)
| * recoverd: Set node_flags information as soon as we get nodemapAmitay Isaacs2013-07-021-3/+3
| | | | | | | | | | | | Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 8d622660a14c929e365d306147b378ea6ab92175)
| * recoverd: Remove old comment as the code corresponding to that has gone awayAmitay Isaacs2013-07-021-4/+0
| | | | | | | | | | | | Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 34af2cdf686d5d77854cbaa7bbcd8f878e9171c7)
| * banning: Log ban state changes for other nodes at higher debug levelAmitay Isaacs2013-07-021-3/+7
| | | | | | | | | | | | Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit c6f8407648abb37f2ed781afa5171dad8c9f59e9)
| * freeze: Make ctdb_start_freeze() a void functionAmitay Isaacs2013-07-023-18/+8
| | | | | | | | | | | | | | | | | | If this function fails due to memory errors, there is no way to recover. The best course of action is to abort. Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 46efe7a886f8c4c56f19536adc98a73c22db906a)
| * freeze: If priority is invalid here, it's time to abortAmitay Isaacs2013-07-021-6/+1
| | | | | | | | | | | | | | | | | | | | | | ctdb_start_freeze() is called from ctdb_control_freeze(), which fixes the priority if it's 0 and returns an error if it's invalid. Other callers of ctdb_start_freeze() are internal to CTDB. So if priority is invalid in ctdb_start_freeze(), something is definitely seriously wrong. Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 87716e8f504d659515d3dbcf93badbf106873bc8)
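The split between validating at the external entry point and aborting in the internal function can be sketched like this. The names are hypothetical, and both the number of priorities (3) and the choice of 1 as the replacement for priority 0 are assumptions of the sketch rather than facts taken from the log above.

```c
#include <stdint.h>
#include <stdlib.h>

/* Number of valid priorities; the value 3 is an assumption. */
#define SKETCH_NUM_PRIORITIES 3

/* Index 0 is unused so that priorities map directly to slots. */
static int sketch_frozen[SKETCH_NUM_PRIORITIES + 1];

/* Internal function: all callers are trusted, so an invalid
 * priority is a programming error and the process aborts. */
void sketch_start_freeze(uint32_t priority)
{
	if (priority < 1 || priority > SKETCH_NUM_PRIORITIES) {
		abort();
	}
	sketch_frozen[priority] = 1;
}

/* External entry point: input comes from outside, so it is
 * repaired or rejected before the internal function is reached. */
int sketch_control_freeze(uint32_t priority)
{
	if (priority == 0) {
		priority = 1; /* assumed default for an unset priority */
	}
	if (priority > SKETCH_NUM_PRIORITIES) {
		return -1;
	}
	sketch_start_freeze(priority);
	return 0;
}

/* Small accessor so callers can observe the state. */
int sketch_is_frozen(uint32_t priority)
{
	if (priority < 1 || priority > SKETCH_NUM_PRIORITIES) {
		return 0;
	}
	return sketch_frozen[priority];
}
```

With this split, a bad value from a client produces an error return, while a bad value from inside the daemon stops the process at the point where the bug shows up.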
| * freeze: Log message from ctdb_start_freeze() and ctdb_control_freeze()Amitay Isaacs2013-07-021-2/+3
| | | | | | | | | | | | | | | | | | | | | | This ensures that whenever databases are frozen either via sending control or by calling ctdb_start_freeze(), the action is logged. Since ctdb_control_freeze() calls ctdb_start_freeze(), move logging of message in early return condition if databases are already frozen. Signed-off-by: Amitay Isaacs <amitay@gmail.com> (This used to be ctdb commit 478e24bceda3fedfba54ccb48faa115df726b819)