author: Volker Lendecke <vl@samba.org> 2009-12-09 15:11:45 +0100
committer: Michael Adam <obnox@samba.org> 2009-12-12 00:45:39 +0100
commit: f6ea3e6bcfce636d41bebb5599aa6b948b9bb884
tree: cc95acf5c6deffc10cdde301f5f6e9f4c63c6971 /ctdb/server/ctdb_control.c
parent: b664a86bc2a1f4f0d6f7fb32caf744effa96bdf4
Make fetch_locked more scalable

This patch improves the handling of the fetch_lock operation on
non-persistent databases that ctdb clients have to do very frequently.
The normal flow how this goes is the following:

1. Client does a local fetch_lock on the database
2. Client looks if the local node is dmaster.
   If yes, everything is fine.
   If no, continue here
3. Client unlocks the local record
4. Client issues a "get me the record" call to ctdbd
5. ctdbd goes out and fetches the dmaster role
6. ctdbd tells the client to retry
7. Client starts over again

The problem is between step 6 and 7: Before the client has had the chance
to retry (i.e. catch the record with a fetch_locked), another node might
have come asking ctdbd to migrate away the record again. This is a real
problem, I've seen >20 loops of this kind in real workloads.

This patch does the following: Whenever ctdb receives a record as result
of step 5, it puts the key on a "holdback list". As long as a key is on
this list, a request to migrate away the dmaster is put on hold. It is the
client's duty to issue the "CTDB_CONTROL_GOTIT" control when it has
successfully done step 2 after having asked ctdb to fetch the record. This
will release the key from the "holdback list" and re-issue all dmaster
migration requests.

As a safeguard against malicious clients, once a second (default
1000msecs, tunable "HoldbackCleanupInterval" in milliseconds) ctdbd goes
over the list of held back keys, deletes them and releases all held back
migration requests.

(This used to be ctdb commit 5736e17c139c9a8049e235429aeae0c6c9d0e93d)

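
The holdback mechanism described in the commit message can be illustrated with a small, self-contained sketch. This is not the code from the patch: the type `struct holdback_list`, the `holdback_*` function names, the fixed-size array, and the string keys are all hypothetical and chosen only to keep the example short. The real ctdbd code works on TDB keys and queued request packets. The sketch only models the three events: a record arrives (key is held), a migration request comes in (deferred while the key is held), and the key is released either by the client's "gotit" or by the periodic cleanup.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_HELD 16   /* illustrative capacity, not a ctdb limit */
#define MAX_KEY  32   /* illustrative key length, not a ctdb limit */

/* One held-back key plus the number of migration requests deferred on it. */
struct held_key {
	char key[MAX_KEY];
	int deferred;
	bool used;
};

struct holdback_list {
	struct held_key slots[MAX_HELD];
};

static struct held_key *holdback_find(struct holdback_list *hl, const char *key)
{
	assert(hl != NULL && key != NULL);
	for (size_t i = 0; i < MAX_HELD; i++) {
		if (hl->slots[i].used && strcmp(hl->slots[i].key, key) == 0) {
			return &hl->slots[i];
		}
	}
	return NULL;
}

/* Step 5 finished: the record arrived on this node, so hold its key back. */
static bool holdback_add(struct holdback_list *hl, const char *key)
{
	if (strlen(key) >= MAX_KEY) {
		return false;
	}
	if (holdback_find(hl, key) != NULL) {
		return true; /* already held */
	}
	for (size_t i = 0; i < MAX_HELD; i++) {
		if (!hl->slots[i].used) {
			strcpy(hl->slots[i].key, key);
			hl->slots[i].deferred = 0;
			hl->slots[i].used = true;
			return true;
		}
	}
	return false; /* list full */
}

/* Another node asks for the dmaster role.
 * Returns true if the request may proceed now, false if it was deferred. */
static bool holdback_request_migration(struct holdback_list *hl, const char *key)
{
	struct held_key *hk = holdback_find(hl, key);
	if (hk == NULL) {
		return true;
	}
	hk->deferred++;
	return false;
}

/* The client reported "gotit": release the key.
 * Returns the number of deferred requests that should be re-issued. */
static int holdback_gotit(struct holdback_list *hl, const char *key)
{
	struct held_key *hk = holdback_find(hl, key);
	if (hk == NULL) {
		return 0;
	}
	hk->used = false;
	return hk->deferred;
}

/* Periodic safeguard: drop every held key, whether or not a client
 * ever sent "gotit". Returns the total number of requests released. */
static int holdback_cleanup(struct holdback_list *hl)
{
	int total = 0;
	for (size_t i = 0; i < MAX_HELD; i++) {
		if (hl->slots[i].used) {
			total += hl->slots[i].deferred;
			hl->slots[i].used = false;
		}
	}
	return total;
}
```

In this model, a caller would invoke `holdback_add()` when the record arrives, `holdback_gotit()` from the handler for the new control, and `holdback_cleanup()` from a timer that fires at the configured interval. The cleanup path is what bounds how long a client that never sends "gotit" can delay other nodes.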
Diffstat (limited to 'ctdb/server/ctdb_control.c')
-rw-r--r--  ctdb/server/ctdb_control.c  4
1 files changed, 4 insertions, 0 deletions
diff --git a/ctdb/server/ctdb_control.c b/ctdb/server/ctdb_control.c
index 3382fae39aa..2b703e73151 100644
--- a/ctdb/server/ctdb_control.c
+++ b/ctdb/server/ctdb_control.c
@@ -281,6 +281,7 @@ static int32_t ctdb_control_dispatch(struct ctdb_context *ctdb,
 	case CTDB_CONTROL_SHUTDOWN:
 		ctdb_stop_recoverd(ctdb);
 		ctdb_stop_keepalive(ctdb);
+		ctdb_stop_holdback_cleanup(ctdb);
 		ctdb_stop_monitoring(ctdb);
 		ctdb_release_all_ips(ctdb);
 		if (ctdb->methods != NULL) {
@@ -560,6 +561,9 @@ static int32_t ctdb_control_dispatch(struct ctdb_context *ctdb,
 		CHECK_CONTROL_DATA_SIZE(sizeof(uint64_t));
 		return ctdb_control_get_db_seqnum(ctdb, indata, outdata);
 
+	case CTDB_CONTROL_GOTIT:
+		return ctdb_control_gotit(ctdb, indata);
+
 	default:
 		DEBUG(DEBUG_CRIT,(__location__ " Unknown CTDB control opcode %u\n", opcode));
 		return -1;