| author | Martin Schwenke <martin@meltin.net> | 2013-12-09 15:54:52 +1100 |
|---|---|---|
| committer | Amitay Isaacs <amitay@samba.org> | 2013-12-17 06:32:35 +0100 |
| commit | fdccaab2a9a1b9d7eebcd7a4d121dbf68ea48dcd | |
| tree | ba45b0e858a0d43b8254d58e5802dd7a61ef9b02 | |
| parent | 970a6efa3b5fc11aa4ff79049738bb971a129a62 | |
| download | samba-fdccaab2a9a1b9d7eebcd7a4d121dbf68ea48dcd.tar.gz samba-fdccaab2a9a1b9d7eebcd7a4d121dbf68ea48dcd.tar.xz samba-fdccaab2a9a1b9d7eebcd7a4d121dbf68ea48dcd.zip | |
ctdb/eventscripts: Do not reconfigure in "monitor" events

"monitor" events can be cancelled. If a reconfigure action does a
service restart then the "monitor" event can be cancelled at the
inconvenient moment after the service is stopped. In this case the
service stays down and the node may become unhealthy (depending on
whether there are any repair actions in the monitor event).

A long time ago we did service reconfiguration in "monitor" events
following failovers. Service reconfiguration was then moved to the
"ipreallocated" event. However, reconfiguration in "monitor" events
has been kept as a last resort in case an "ipreallocated" event does
not occur. The only important case that this covers is "ctdb
deleteip", where "releaseip" events are generated without a
corresponding "ipreallocated". Therefore, IPs can be deleted without
running the required service reconfiguration.

The supported way of removing IP addresses is now via "ctdb
reloadips", which always causes a takeover run with a corresponding
"ipreallocated" event.

This means that service reconfiguration in "monitor" events is no
longer required and should be removed because it is unsafe.

Also update the associated tests. Make the first confirm that the
monitor event no longer does reconfiguration. Change the others to
test that monitor status is correctly replayed when something else is
doing a reconfigure and currently holds the reconfigure lock.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
Autobuild-Date(master): Tue Dec 17 06:32:35 CET 2013 on sn-devel-104
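
The hazard described in the commit message can be illustrated with a small simulation. This is only a sketch under stated assumptions: `fake_restart` and `state_file` are hypothetical stand-ins for a service restart and the service's state, and a background job killed part-way through stands in for a cancelled "monitor" event. None of this is CTDB code.

```shell
#!/bin/sh
# Illustration only: fake_restart and state_file are hypothetical stand-ins,
# not CTDB code.
state_file=$(mktemp)
echo "running" > "$state_file"

fake_restart () {
    echo "stopped" > "$state_file"   # "service stop" has completed
    sleep 3                          # restart is still in progress
    echo "running" > "$state_file"   # "service start", not reached if cancelled
}

# Run the handler in the background, as a stand-in for a "monitor" event
fake_restart &
handler_pid=$!

# Wait until the stop half is done, then cancel the handler
while [ "$(cat "$state_file")" != "stopped" ] ; do
    sleep 1
done
kill "$handler_pid"
wait "$handler_pid" 2>/dev/null || true

final_state=$(cat "$state_file")
echo "service state after cancelled event: $final_state"
rm -f "$state_file"
```

Because the handler is killed between the stop and the start, the simulated service is left stopped, which is the situation the commit avoids by no longer reconfiguring in "monitor" events.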
| Mode | File | Lines changed |
|---|---|---|
| -rwxr-xr-x | ctdb/config/functions | 10 |
| -rwxr-xr-x | ctdb/tests/eventscripts/60.nfs.multi.002.sh | 10 |
| -rwxr-xr-x | ctdb/tests/eventscripts/60.nfs.multi.003.sh | 5 |
| -rwxr-xr-x | ctdb/tests/eventscripts/60.nfs.multi.004.sh | 5 |
| -rwxr-xr-x | ctdb/tests/eventscripts/60.nfs.multi.005.sh | 5 |

5 files changed, 11 insertions, 24 deletions
diff --git a/ctdb/config/functions b/ctdb/config/functions
index aa31f89103..4430d866bf 100755
--- a/ctdb/config/functions
+++ b/ctdb/config/functions
@@ -1195,16 +1195,6 @@ ctdb_service_check_reconfigure ()
                     ctdb_service_reconfigure
                 fi
                 ;;
-            monitor)
-                if ctdb_service_needs_reconfigure ; then
-                    ctdb_service_reconfigure
-                    # Given that the reconfigure might not have
-                    # resulted in the service being stable yet, we
-                    # replay the previous status since that's the best
-                    # information we have.
-                    ctdb_replay_monitor_status
-                fi
-                ;;
         esac
     else
         # Somebody else is running an event we don't want to collide
diff --git a/ctdb/tests/eventscripts/60.nfs.multi.002.sh b/ctdb/tests/eventscripts/60.nfs.multi.002.sh
index 350c1bc726..29386c13b2 100755
--- a/ctdb/tests/eventscripts/60.nfs.multi.002.sh
+++ b/ctdb/tests/eventscripts/60.nfs.multi.002.sh
@@ -2,7 +2,7 @@
 
 . "${TEST_SCRIPTS_DIR}/unit.sh"
 
-define_test "takeip, monitor -> reconfigure"
+define_test "takeip, monitor -> no reconfigure"
 
 setup_nfs
 
@@ -12,12 +12,6 @@ ok_null
 
 simple_test_event "takeip" $public_address
 
-# This currently assumes that ctdb scriptstatus will always return a
-# good status (when replaying). That should change and we will need
-# to split this into 2 tests.
-ok <<EOF
-Reconfiguring service "nfs"...
-Replaying previous status for this script due to reconfigure...
-EOF
+ok_null
 
 simple_test_event "monitor"
diff --git a/ctdb/tests/eventscripts/60.nfs.multi.003.sh b/ctdb/tests/eventscripts/60.nfs.multi.003.sh
index 68f45ab15d..653dece07a 100755
--- a/ctdb/tests/eventscripts/60.nfs.multi.003.sh
+++ b/ctdb/tests/eventscripts/60.nfs.multi.003.sh
@@ -2,7 +2,7 @@
 
 . "${TEST_SCRIPTS_DIR}/unit.sh"
 
-define_test "takeip, monitor -> reconfigure, replay error"
+define_test "takeip, take reconfigure lock, monitor -> replay error"
 
 setup_nfs
 
@@ -16,8 +16,9 @@ simple_test_event "takeip" $public_address
 
 ctdb_fake_scriptstatus 1 "ERROR" "$err"
 
+eventscript_call ctdb_reconfigure_try_lock
+
 required_result 1 <<EOF
-Reconfiguring service "nfs"...
 Replaying previous status for this script due to reconfigure...
 $err
 EOF
diff --git a/ctdb/tests/eventscripts/60.nfs.multi.004.sh b/ctdb/tests/eventscripts/60.nfs.multi.004.sh
index b071ec8bd9..43323cf61f 100755
--- a/ctdb/tests/eventscripts/60.nfs.multi.004.sh
+++ b/ctdb/tests/eventscripts/60.nfs.multi.004.sh
@@ -2,7 +2,7 @@
 
 . "${TEST_SCRIPTS_DIR}/unit.sh"
 
-define_test "takeip, monitor -> reconfigure, replay timedout"
+define_test "takeip, take reconfigure lock, monitor -> reconfigure, replay timedout"
 
 setup_nfs
 
@@ -16,8 +16,9 @@ simple_test_event "takeip" $public_address
 
 ctdb_fake_scriptstatus -62 "TIMEDOUT" "$err"
 
+eventscript_call ctdb_reconfigure_try_lock
+
 required_result 1 <<EOF
-Reconfiguring service "nfs"...
 Replaying previous status for this script due to reconfigure...
 [Replay of TIMEDOUT scriptstatus - note incorrect return code.] $err
 EOF
diff --git a/ctdb/tests/eventscripts/60.nfs.multi.005.sh b/ctdb/tests/eventscripts/60.nfs.multi.005.sh
index 82802aa01e..9816bec838 100755
--- a/ctdb/tests/eventscripts/60.nfs.multi.005.sh
+++ b/ctdb/tests/eventscripts/60.nfs.multi.005.sh
@@ -2,7 +2,7 @@
 
 . "${TEST_SCRIPTS_DIR}/unit.sh"
 
-define_test "takeip, monitor -> reconfigure, replay disabled"
+define_test "takeip, take reconfigure lock, monitor -> reconfigure, replay disabled"
 
 setup_nfs
 
@@ -16,8 +16,9 @@ simple_test_event "takeip" $public_address
 
 ctdb_fake_scriptstatus -8 "DISABLED" "$err"
 
+eventscript_call ctdb_reconfigure_try_lock
+
 ok <<EOF
-Reconfiguring service "nfs"...
 Replaying previous status for this script due to reconfigure...
 [Replay of DISABLED scriptstatus - note incorrect return code.] $err
 EOF
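
Tests 003 to 005 now take the reconfigure lock before firing "monitor" and expect the previous status to be replayed. The following is a rough sketch of that pattern, not the CTDB implementation: every `demo_*` name, the `mkdir`-based lock, and the stored status and output are assumptions made for illustration.

```shell
#!/bin/sh
# Illustration only: every demo_* name is hypothetical, not a CTDB function.
tmp_dir=$(mktemp -d)
lock_dir="$tmp_dir/reconfigure.lock"

# Pretend the previous "monitor" run failed with this status and output
previous_status=1
previous_output="previous monitor failure"

demo_try_lock () {
    # mkdir either creates the directory or fails, so only one caller wins
    mkdir "$lock_dir" 2>/dev/null
}

demo_monitor () {
    if demo_try_lock ; then
        rmdir "$lock_dir"
        echo "running normal monitor checks"
        return 0
    fi
    # Lock is held elsewhere: report the last known status instead
    echo "Replaying previous status for this script due to reconfigure..."
    echo "$previous_output"
    return "$previous_status"
}

demo_try_lock                 # something else takes the lock first
replay_status=0
replay_output=$(demo_monitor) || replay_status=$?
echo "$replay_output"
echo "exit status: $replay_status"
rm -rf "$tmp_dir"
```

With the lock already held, the sketch skips its normal checks and returns the stored status, mirroring what the tests assert after calling `ctdb_reconfigure_try_lock`.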
