| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
|
|
|
|
|
|
| |
delete_record_traverse()
Signed-off-by: Michael Adam <obnox@samba.org>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
|
|
|
|
|
|
|
| |
At this point, it is >= 0 anyways.
Signed-off-by: Michael Adam <obnox@samba.org>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
|
|
|
|
|
| |
Signed-off-by: Michael Adam <obnox@samba.org>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
CTDB tracks connections to be able to send tickle ACKs and gratuitous
ARPs. When there are no public IPs, there is no need for tickle ACKs
and gratuitous ARPs.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
Autobuild-User(master): Martin Schwenke <martins@samba.org>
Autobuild-Date(master): Tue Mar 4 03:01:38 CET 2014 on sn-devel-104
|
|
|
|
|
|
|
|
|
|
| |
Fix suggested by by Kevin Osborn <kosborn@overlandstorage.com>.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
Autobuild-User(master): Martin Schwenke <martins@samba.org>
Autobuild-Date(master): Thu Feb 27 13:54:59 CET 2014 on sn-devel-104
|
|
|
|
|
|
|
|
|
|
|
| |
tcp_update_flag is set to true whenever tickles are added or deleted.
This flag is used to determine whether or not to send tickles list to
other nodes. Once tickles list is sent to other nodes successfully,
set tcp_update_flag to false, so ctdbd does not keep sending same tickles
list every TickleUpdateInterval (20 seconds).
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
|
|
|
|
|
|
|
|
|
|
| |
This doesn't implement what was recommended. That would require
careful error handling, probably with a fallback to this code anyway.
This is simple and does no worse that the current code. That is, the
new node is updated on the next call to tdb_update_tcp_tickles().
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
|
|
|
|
|
| |
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
|
|
|
|
|
|
|
|
| |
This fixes ctdb crash reported in bug #10366.
Fix suggested by Kevin Osborn <kosborn@overlandstorage.com>.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Remove unnecessary candimbl parameter.
This parameter can be cheaply calculated in
lcp2_failback_candidate(). The compiler will probably do an
excellent job optimising it. :-)
* Clarify a debug statement
This is much clearer than doing a complex recalculation of a known
value.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
|
|
|
|
|
|
|
|
|
| |
Currently this can be checked many times. However, there's no point
calling the rebalance/failback code at all if there are no rebalance
candidates.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
|
|
|
|
|
|
|
|
|
|
| |
The fast vacuum run may have increased the freelist size.
Signed-off-by: Michael Adam <obnox@samba.org>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
Autobuild-Date(master): Fri Feb 14 03:15:30 CET 2014 on sn-devel-104
|
|
|
|
|
| |
Signed-off-by: Michael Adam <obnox@samba.org>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
srcimbl gets changed on every iteration of the loop. The value that
should be stored for the new imbalance of the source node is
minsrcimbl.
To help diagnose this, added some extra debug that can be left in.
The extra debug changes the output of a couple of tests. Note that
the resulting IP allocations in those tests is unchanged - only the
debug output is changed.
Also add some new tests that illustrates the bug.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
If CTDB_DEUB_HUNG_SCRIPT is set, use that instead of the default
debug script. This code was dropped by mistake in commit
18c1f432102f1a5093927be9276d001180539e50.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
Autobuild-User(master): Martin Schwenke <martins@samba.org>
Autobuild-Date(master): Wed Feb 12 08:47:47 CET 2014 on sn-devel-104
|
|
|
|
|
|
|
| |
If event script does not exist or does not have execute permissions, then
return negative errno to distinguish from the exit errors of event script.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
|
|
|
|
|
|
|
| |
Otherwise ctdb_client_async_wait() is a no-op.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Instead of using RB tree for sorting the script names (incorrectly since
it's only using the leading numbers in the script name), use scandir
with alphasort.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
Autobuild-User(master): Martin Schwenke <martins@samba.org>
Autobuild-Date(master): Tue Jan 21 06:41:25 CET 2014 on sn-devel-104
|
|
|
|
|
|
|
|
|
|
|
|
| |
Any currently running monitor events are cancelled if any other events
are scheduled. However, this does not stop monitor events to be run
when other events are already running.
Keep track of the number of active events and schedule monitor event
only if there are no active events.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
At the moment ctdb_check_healthy() is overloaded to wait until the
first recovery is complete, handle the "startup" event and also
actually handle monitoring. This is untidy and hard to follow.
Instead, have the daemon explicitly wait for 1st recovery after the
"setup" event. When first recovery is complete, schedule a function
to handle the "startup" event. When the "startup" event succeeds then
explicitly enable monitoring.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Failure might be expected when disabling takeover runs on banned
nodes, since they might be suffering from performance problems or
similar. More broadly, administrators who reconfigure a cluster that
isn't in a happy state aren't necessarily doing something sensible.
However, allowing takeover runs to be disabled on inactive nodes stops
reconfiguration of stopped nodes. This is probaby an unreasonable
limitation, so drop it.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
|
|
|
|
|
|
|
|
| |
Currently timeouts for controls to inactive nodes can cause banning
credits to be applied. This should not happen.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
|
|
|
|
|
|
|
|
|
|
| |
This function has been replaced with ctdb_vfork_with_logging().
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
Autobuild-User(master): Martin Schwenke <martins@samba.org>
Autobuild-Date(master): Thu Jan 16 04:05:35 CET 2014 on sn-devel-104
|
|
|
|
|
|
|
| |
Eventscripts are now executed using a helper.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
|
|
|
|
|
|
|
|
|
| |
(part 2)
Use ctdb_event_helper to run debug-hung-script.sh.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
|
|
|
|
|
|
|
|
|
| |
(part 1)
Use ctdb_event_helper to run eventscripts.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
|
|
|
|
|
| |
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
|
|
|
|
|
|
|
|
| |
This will be used to spawn lightweight helper processes to run
eventscripts.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
|
|
|
|
|
|
|
|
| |
This was added to support external monitoring using CTDB event scripts.
However, it was never used.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
|
|
|
|
|
|
|
| |
These events have never been used.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
|
|
|
|
|
|
|
|
|
|
|
| |
Otherwise new requests can come in during the latter parts of the
takeover run when the IP allocation algorithm has already run, and the
new requests will be dequeued even though they haven't really be
processed.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Michael Adam <obnox@samba.org>
|
|
|
|
|
|
|
|
| |
Otherwise recovery ends up done by RSN when it is unnecessary.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Michael Adam <obnox@samba.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
There is a significant overhead using fork() over vfork(), specially
when the child process execs a helper. The overhead is in memory space
and time.
# strace -c ./test_fork 1024 200
count=1024, size=204800, total=200M
failed fork=0
time for fork() = 4879.597000 us
% time seconds usecs/call calls errors syscall
------ ----------- ----------- --------- --------- ----------------
100.00 4.543321 3304 1375 375 clone
0.00 0.000071 0 1033 mmap
0.00 0.000000 0 1 read
0.00 0.000000 0 3 write
0.00 0.000000 0 2 open
0.00 0.000000 0 2 close
0.00 0.000000 0 3 fstat
0.00 0.000000 0 3 mprotect
0.00 0.000000 0 1 munmap
0.00 0.000000 0 3 brk
0.00 0.000000 0 1 1 access
0.00 0.000000 0 1 execve
0.00 0.000000 0 1 arch_prctl
------ ----------- ----------- --------- --------- ----------------
100.00 4.543392 2429 376 total
# strace -c ./test_vfork 1024 200
count=1024, size=204800, total=200M
failed fork=0
time for fork() = 82.041000 us
% time seconds usecs/call calls errors syscall
------ ----------- ----------- --------- --------- ----------------
96.47 0.001204 1 1000 vfork
3.53 0.000044 0 1033 mmap
0.00 0.000000 0 1 read
0.00 0.000000 0 3 write
0.00 0.000000 0 2 open
0.00 0.000000 0 2 close
0.00 0.000000 0 3 fstat
0.00 0.000000 0 3 mprotect
0.00 0.000000 0 1 munmap
0.00 0.000000 0 3 brk
0.00 0.000000 0 1 1 access
0.00 0.000000 0 1 execve
0.00 0.000000 0 1 arch_prctl
------ ----------- ----------- --------- --------- ----------------
100.00 0.001248 2054 1 total
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Michael Adam <obnox@samba.org>
|
|
|
|
|
|
|
|
|
| |
When a child process is created for a lock request, the current locks
statistics should be updated immediately. This will provide accurate
information on number of active lock requests.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Michael Adam <obnox@samba.org>
|
|
|
|
|
| |
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Michael Adam <obnox@samba.org>
|
|
|
|
|
|
|
|
|
|
|
|
| |
This limit was currently a global limit and not per database. This
prevents any database freeze lock requests from getting scheduled if
the global limit was reached.
Only individual record requests should be limited and database freeze
requests should always get scheduled.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Michael Adam <obnox@samba.org>
|
|
|
|
|
|
|
|
|
| |
When running a mixed version cluster, compatibility with older
versions was was broken during recent refactorisation.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Michael Adam <obnox@samba.org>
|
|
|
|
|
|
| |
Signed-off-by: Martin Schwenke <martin@meltin.net>
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Michael Adam <obnox@samba.org>
|
|
|
|
|
|
|
|
| |
This also happens earlier in do_recovery() and the nodemap is not
updated after that, so this update is redundant.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Michael Adam <obnox@samba.org>
|
|
|
|
|
| |
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Michael Adam <obnox@samba.org>
|
|
|
|
|
|
| |
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
(This used to be ctdb commit ca5fc3431573c44d55d09d987c715fb53756fc1f)
|
|
|
|
|
|
|
|
|
| |
Rebalance target nodes should be set even if a deferred rebalance is
not configured. The user can explicitly cause a takeover run.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit afd9b51644af074752d74c412cb4e7ec2eba2c69)
|
|
|
|
|
|
| |
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit 275ed9ebe287e39d891888c13810c70f347af8ac)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
at start"
This is unnecessary due to 03e2e436db5cfd29a56d13f5d2101e42389bfc94.
Furthermore, if a node doesn't force an election but wins it then it
can fail to record that it is the new recovery master. This can lead
to a reverse split brain where there is no recovery master.
This reverts commit c5035657606283d2e35bea40992505e84ca8e7be.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>
Conflicts:
server/ctdb_recoverd.c
(This used to be ctdb commit c8b542e059a54b8d524bd430cad9d82e5edd864d)
|
|
|
|
|
|
|
|
|
| |
This is important enough that we should see it when the log level is
DEBUG_NOTICE.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit eb8ec5681bfccb26c8ffae72952d54bb0ba46249)
|
|
|
|
|
|
|
|
|
|
|
| |
5 minutes is too long to leave the cluster in limbo if the recovery
daemon dies during a takeover run, even though this is quite unlikely.
We need a new recover master to be able to do takeover runs fairly
quickly.
This reverts commit 71080676bb4acbd0d9b595a30cf7fe6dddbf426f.
(This used to be ctdb commit 3e41170c78fc7a2bf526129c9b7db3739b61c6bf)
|
|
|
|
|
|
|
|
|
| |
Use sequence numbers to do recovery for persistent databases instead of
RSNs. This fixes the problem of registry corruption during recovery.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
(This used to be ctdb commit 56486d1c01cc8ad0e4b8cee7a22429e72e50f03d)
|
|
|
|
|
|
|
|
|
| |
Introduce CTDB_VARDIR variable that points to /var/lib/ctdb by default.
This makes CTDB_VARDIR consistent across C code and scripts.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
(This used to be ctdb commit 2c09aac71188f43cd592572b10ea30b7a2969678)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
No need to check if the options are set. The options are always set
via static defaults.
No need to talloc_strdup() the values via wrapper functions. The
options aren't going away. Remove now unused ctdb_set_tdb_dir() and
similar functions.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>
(This used to be ctdb commit 1fe82f3d7b610547ff4945887f15dd6c5798a49b)
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Defaults for ctdb->db_directory and similar variables are currently
set in 2 places.
Change this to set them in only 1 place and make the directories at
initialisation time instead of waiting until later.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>
(This used to be ctdb commit d73d84346488a2ed54e6a86f9d7ec641c8e33ace)
|