| author | Noriko Hosoi <nhosoi@redhat.com> | 2010-03-05 10:07:38 -0800 |
|---|---|---|
| committer | Noriko Hosoi <nhosoi@redhat.com> | 2010-03-05 10:07:38 -0800 |
| commit | 0b95451c7e50cb6b2d0cb310dddca18336e1b2ac (patch) | |
| tree | 82cab73fc5a8326f0f4e5bd5869f154895b98532 /ldap/servers/plugins/replication/repl5_protocol.c | |
| parent | d66eb3dd9fdb9648b5058161bf8a7740a16fb2d8 (diff) | |
570667 - MMR: simultaneous total updates on the masters cause
deadlock and data loss
https://bugzilla.redhat.com/show_bug.cgi?id=570667
Description: In an MMR topology, if a master receives a total
update request to initialize another master while it is itself
being initialized by that master, the 2 replication threads hang
and the replicated backend instance can be wiped out.
To prevent a server from acting as total update supplier and
consumer at the same time, the REPLICA_TOTAL_EXCL_SEND and
REPLICA_TOTAL_EXCL_RECV bits have been introduced. While the
server is sending a total update to other replicas, it rejects
any total update request on the backend. The server can still
send multiple total updates to other replicas at the same time.
While a total update from another master is in progress on the
server, the server rejects both a total update from yet another
master and a request to initialize other replicas.
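
The exclusion rules described above can be sketched as a small
standalone C model. This is an illustration only, not the code in
this commit: excl_state and the excl_* functions are hypothetical
names, and the sketch uses a sender counter under a pthread mutex
where the commit uses the REPLICA_TOTAL_EXCL_SEND and
REPLICA_TOTAL_EXCL_RECV state flags on the Replica object.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-replica state; this is NOT the
 * Replica struct used by the replication plugin. */
typedef struct {
    pthread_mutex_t lock;
    unsigned senders;   /* total updates currently being sent */
    bool receiving;     /* true while a total update is being received */
} excl_state;

#define EXCL_STATE_INIT { PTHREAD_MUTEX_INITIALIZER, 0, false }

/* Try to start sending a total update. Refused while this replica is
 * being initialized; any number of concurrent senders is allowed. */
bool excl_begin_send(excl_state *s)
{
    bool ok;
    pthread_mutex_lock(&s->lock);
    ok = !s->receiving;
    if (ok) {
        s->senders++;
    }
    pthread_mutex_unlock(&s->lock);
    return ok;
}

/* Mark one outgoing total update as finished. */
void excl_end_send(excl_state *s)
{
    pthread_mutex_lock(&s->lock);
    if (s->senders > 0) {
        s->senders--;
    }
    pthread_mutex_unlock(&s->lock);
}

/* Try to start receiving a total update. Refused while any total
 * update is being sent or another one is already being received. */
bool excl_begin_recv(excl_state *s)
{
    bool ok;
    pthread_mutex_lock(&s->lock);
    ok = (s->senders == 0) && !s->receiving;
    if (ok) {
        s->receiving = true;
    }
    pthread_mutex_unlock(&s->lock);
    return ok;
}

/* Mark the incoming total update as finished. */
void excl_end_recv(excl_state *s)
{
    pthread_mutex_lock(&s->lock);
    s->receiving = false;
    pthread_mutex_unlock(&s->lock);
}
```

In this model a caller pairs each successful excl_begin_send() with
one excl_end_send(), and each successful excl_begin_recv() with one
excl_end_recv(). The check and the state change happen under the
same lock, so two threads cannot both pass the test and then both
proceed.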
Diffstat (limited to 'ldap/servers/plugins/replication/repl5_protocol.c')
| -rw-r--r-- | ldap/servers/plugins/replication/repl5_protocol.c | 28 |
1 files changed, 28 insertions, 0 deletions
diff --git a/ldap/servers/plugins/replication/repl5_protocol.c b/ldap/servers/plugins/replication/repl5_protocol.c
index 927c450a..efb32716 100644
--- a/ldap/servers/plugins/replication/repl5_protocol.c
+++ b/ldap/servers/plugins/replication/repl5_protocol.c
@@ -317,6 +317,28 @@ prot_thread_main(void *arg)
 			dev_debug("prot_thread_main(STATE_PERFORMING_INCREMENTAL_UPDATE): end");
 			break;
 		case STATE_PERFORMING_TOTAL_UPDATE:
+		{
+			Slapi_DN *dn = agmt_get_replarea(agmt);
+			Replica *replica = NULL;
+			Object *replica_obj = replica_get_replica_from_dn(dn);
+			if (replica_obj)
+			{
+				replica = (Replica*) object_get_data (replica_obj);
+				/* If total update against this replica is in progress,
+				 * we should not initiate the total update to other replicas. */
+				if (replica_is_state_flag_set(replica, REPLICA_TOTAL_EXCL_RECV))
+				{
+					object_release(replica_obj);
+					slapi_log_error(SLAPI_LOG_FATAL, repl_plugin_name,
+						"%s: total update on the replica is in progress. Cannot initiate the total update.\n", agmt_get_long_name(rp->agmt));
+					break;
+				}
+				else
+				{
+					replica_set_state_flag (replica, REPLICA_TOTAL_EXCL_SEND, 0);
+				}
+			}
+
 			PR_Lock(rp->lock);
 			/* stop incremental protocol if running */
@@ -332,7 +354,13 @@ prot_thread_main(void *arg)
 			   replica initialization is completed. */
 			agmt_replica_init_done (agmt);
+			if (replica_obj)
+			{
+				replica_set_state_flag (replica, REPLICA_TOTAL_EXCL_SEND, 1);
+				object_release(replica_obj);
+			}
 			break;
+		}
 		case STATE_FINISHED:
 			dev_debug("prot_thread_main(STATE_FINISHED): exiting prot_thread_main");
 			done = 1;
