| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
|
|
|
|
|
|
|
|
| |
Bug Description: crash in reliab15 test
Reviewed by: nkinder (Thanks!)
Fix Description: My earlier fix was for the case where the result reader thread disconnects. But it looks like there is still a problem if the update sender thread disconnects out from under the reader thread. We need to use conn_connected() to test to see if the connection is connected before we attempt to access conn->ld in the result reader thread. I also improved the error messages so that I could tell if the errors were coming from the update sender thread or the result reader thread.
Platforms tested: RHEL5
Flag Day: no
Doc impact: no
|
|
|
|
|
|
|
|
|
|
| |
Bug Description: crash in reliab15 test
Reviewed by: nhosoi (Thanks!)
Fix Description: I could not reproduce the crash, but I think the problem is that the server is not handling the disconnection case correctly. It seems that in the event of disconnection (LDAP_SERVER_DOWN 81 - Can't contact server) the code would continue to read results.
repl5_inc_result_threadmain() will call conn_read_result_ex() in a loop. If conn_read_result_ex() detects a disconnection or an unrecoverable error, it will call conn_disconnect to close the connection, and return CONN_NOT_CONNECTED. Once this happens, the code must not use conn->ld any more. However, the code did not differentiate between the not connected case and other errors, so it would keep trying to read results (in case some errors are recoverable, the thread still has to read all of the pending results). The code has been fixed to handle disconnect cases specially. I also added some additional locking to make sure the result and the abort flags were set/read correctly. Finally, I changed the code that waits for results to come in, so that if the connection has been closed, it will just return immediately.
Platforms tested: RHEL5 x86_64
Flag Day: no
Doc impact: no
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Bug Description: slapd crashes after changelog is moved
Reviewed by: nkinder, nhosoi (Thanks!)
Fix Description: There are a number of real fixes, mixed in with many changes for debugging and instrumentation.
1) When the update thread gets the changelog iterator, it will use _cl5AddThread to increment the count of threads holding an open handle to the changelog. When it releases the iterator, or if there were some error acquiring the database handle, it will decrement the thread count. The way it used to work was that it would increment the thread count when retrieving the DB object, but then would immediately decrement it, meaning it had an open handle to the database, but there was no way for the changelog code to know that (except via the reference count on the DB object itself).
2) Changed the AddThread code to increment the thread count outside of the state lock - this better fits the semantics of the other uses of threadcount which are outside of the lock.
3) The changelog code that closes the databases was not closing things down in the correct order. The first thing it must do is wait for all threads with open database handles or otherwise accessing the database to terminate. Once that is done, it can call _cl5DBClose() to actually close all of the databases. Otherwise, a race condition could cause a database to be accessed after it has been closed.
4) Added clcache cleanup code, and made it possible to re-init the clcache. The clcache was not designed to be dynamically closed and opened.
clcache is init-ed in _cl5Open
clcache_init is re-entrant
Added more code to clean up the clcache
Delete the clcache in _cl5Delete
5) The clcache stores the current buffer in a thread private storage area. If the clcache has been re-initialized, this buffer is also invalid and the clcache code must get a new buffer.
Platforms tested: RHEL5
Flag Day: no
Doc impact: no
|
|
|
|
|
| |
Description: slapd hang during cs80 cloning setup.
Fix Description: Not exactly related to the bug, but Noriko found a couple of places during investigation of internal add operations where the Slapi_Entry* could be leaked upon error. These fixes ensure that the entry is properly freed in case of error.
|
|
|
|
|
|
|
|
|
| |
Bug Description: slapd hang during cs80 cloning setup.
Reviewed by: nhosoi (Thanks!)
Fix Description: If replication code attempts to add the RUV entry during replica configuration, and the add operation returns an error, the code will attempt to free the entry. This causes a double free. Internal add operations always consume and free the entry, success or failure. The solution is to set the entry to NULL just after adding it so the clean up code will not be able to free it again.
Platforms tested: RHEL5
Flag Day: no
Doc impact: no
|
|
|
|
|
|
|
|
|
| |
Bug Description: DS crash when modify entry that does not exist in AD
Reviewed by: nkinder (Thanks!)
Fix Description: The function that checks to see if the mod has already been made to the AD entry should just return 0 if the AD entry does not exist or could not be found - in this case, the regular windows replay code will handle it.
Platforms tested: RHEL5
Flag Day: no
Doc impact: no
|
|
|
|
|
|
|
|
|
| |
Bug Description: Configuring Server to Server GSSAPI over SSL - Need better Error Message
Reviewed by: nkinder (Thanks!)
Fix Description: If the user attempts to set the bind mech to GSSAPI, and a secure transport is being used, the server will return LDAP_UNWILLING_TO_PERFORM and provide a useful error message. Same if GSSAPI is being used and the user attempts to use a secure transport.
Platforms tested: RHEL5
Flag Day: no
Doc impact: no
|
|
|
|
|
|
|
|
|
| |
Bug Description: Removing Group Member in ADS and Send and Receive Updates Crashes the Directory Server
Reviewed by: nkinder (Thanks!)
Fix Description: I broke this with my earlier fix about sending mods to AD. There are calls which reset the raw entry from AD before the call to mod_already_made. The fix is to only retrieve the raw entry just before we use it, after it may have been reset. I also found a memory leak in the mod init with valueset function I added for the prior fix.
Platforms tested: RHEL5
Flag Day: no
Doc impact: no
|
|
|
|
|
|
|
|
|
| |
Bug Description: DirSync interval should be configurable
Reviewed by: nhosoi (Thanks!)
Fix Description: Added a new config attribute - winSyncInterval - this is how often to run the dirsync search, in seconds. The default is 600 (5 minutes) which was the old hard coded value. Due to the way it's coded, the change only takes effect when the agreement is created or restarted, so the value cannot really be dynamically changed.
Platforms tested: RHEL5
Flag Day: no
Doc impact: yes - document the new attribute
|
|
|
|
|
|
|
|
|
|
|
| |
Bug Description: WinSync ignores entry if NT attributes are added later.
Reviewed by: nkinder (Thanks!)
Fix Description: If we are replaying a modify operation, we need to check if the ntUser objectclass is being added along with the other attributes that tell the sync service to sync this entry. If the objectclass is being added or replaced, we check the existing entry to see if it is still a sync-able entry. If it is, we call process_replay_add to add the entry. I changed this function to accept a Slapi_Entry to add rather than the operation structure. Finally, I had to change the way we send the Account Control flags to take into account an entry that may have been added as a result of a modify operation.
I fixed a memory leak when setting the Slapi_Attr attribute type, and cleaned up a compiler warning.
NOTE: There will be no clear text password to send (unless the userPassword was modified in the same modify operation). This means the account will be added to Windows, and will be enabled, but will be essentially unusable - the user cannot login - until either the user modifies the password on the directory server side, or the administrator resets the password.
Platforms tested: RHEL5
Flag Day: no
Doc impact: yes - we will have to document the new winsync behavior
|
|
|
|
|
|
|
|
|
|
|
| |
Bug Description: winsync doesn't recognize some changes
Reviewed by: nkinder (Thanks!)
Fix Description: Before sending updates to AD, first check to see if the updates still apply. For modify/add operations, check to make sure the value to add doesn't exist. If it does, remove it from the list of values in the mod. If all values are removed, then just skip the modify/add op altogether. For modify/del ops, check to see if the attribute exists. If not, just skip the op. If it does exist, check to see if the values exist, and remove the values from the mod/del op that do not exist anymore. If all values have been removed, just skip the mod/del op.
I added a new slapi function - slapi_mod_init_valueset_byval - which will init a Slapi_Mod and init the list of values using a valueset. Fortunately there was already a function for converting a Slapi_Value** to a berval**.
I also fixed a few compiler warnings.
Platforms tested: RHEL5
Flag Day: no
Doc impact: yes - add new function to slapi docs
|
|
|
|
| |
Summary: Optimized fetching of remote entry when checking if a rename is needed with winsync.
|
|
|
|
| |
Summary: Add support for synchronizing the cn attribute between DS and AD.
|
|
|
|
|
|
|
|
|
| |
Bug Description: rhds accounts are disabled in ad after full sync
Reviewed by: nkinder (Thanks!)
Fix Description: The incremental sync code calls send_accountcontrol_modify after adding an entry, but the total update code does not. I modified the code to do that. I also changed the send_accountcontrol_modify to force the account to be enabled if adding it. I tried just adding userAccountContro:512 to the default user add template, but AD does not like this - gives operations error. So you have to modify userAccountControl after adding the entry. I also cleaned up a couple of minor memory leaks.
Platforms tested: RHEL5
Flag Day: no
Doc impact: Yes - we need to document the fact that new accounts will now be created in AD enabled
|
|
|
|
|
|
|
|
|
|
|
| |
Bug Description: rhds80 seg fault - pass sync - entry missing userPassword ?
Reviewed by: nkinder (Thanks!)
Fix Description: The fix is pretty obvious - just make sure we don't deref a NULL. The reason for the NULL is due to a sequence of more than one modify for the userPassword attribute, where one of the mods is a replace with no value or a delete of the attribute. The bug has the details about how to reproduce. One thing I don't know is what client is generating this sequence of operations . . .
Platforms tested: RHEL5
Flag Day: no
Doc impact: no
QA impact: should be covered by regular nightly and manual testing
New Tests integrated into TET: none
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Bug Description: Need to address 64-bit compiler warnings - again
Reviewed by: nhosoi (Thanks!)
Fix Description: This patch cleans up most of the other remaining compiler warnings. I compiled the directory server code with these flags on RHEL5 x86_64: -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic
I also enabled argument/format match checking for most of the commonly used varadic functions. Most of the problems I found fell into these categories:
1) Too many or not enough arguments e.g. most everything that uses or did use LDAPDebug had extra 0,0 arguments. If they had been switched to use slapi_log_error, I removed the extra arguments - for those places still using LDAPDebug, I introduced more macros to handle the number of arguments, since C macros cannot be varadic.
2) When using NSPR formatting functions, we have to use %llu or %lld for 64-bit values, even on 64-bit systems. However, for regular system formatting functions, we have to use %ld or %lu. I introduced two new macros NSPRIu64 and NSPRI64 to handle cases where we are passing explicit 64-bit values to NSPR formatting functions, so that we can use the regular PRIu64 and PRI64 macros for regular system formatting functions. I also made sure we used NSPRI* only with NSPR functions, and used PRI* only with system functions.
3) use %lu for size_t and %ld for time_t
I did find a few "real" errors, places that the code was doing something definitely not right:
https://bugzilla.redhat.com/attachment.cgi?id=325774&action=diff#ldapserver/ldap/servers/plugins/acl/aclinit.c_sec4
https://bugzilla.redhat.com/attachment.cgi?id=325774&action=diff#ldapserver/ldap/servers/plugins/acl/acllas.c_sec17
https://bugzilla.redhat.com/attachment.cgi?id=325774&action=diff#ldapserver/ldap/servers/plugins/http/http_impl.c_sec1
https://bugzilla.redhat.com/attachment.cgi?id=325774&action=diff#ldapserver/ldap/servers/plugins/memberof/memberof.c_sec1
https://bugzilla.redhat.com/attachment.cgi?id=325774&action=diff#ldapserver/ldap/servers/plugins/pam_passthru/pam_ptimpl.c_sec1
https://bugzilla.redhat.com/attachment.cgi?id=325774&action=diff#ldapserver/ldap/servers/plugins/replication/cl5_api.c_sec5
https://bugzilla.redhat.com/attachment.cgi?id=325774&action=diff#ldapserver/ldap/servers/plugins/replication/cl5_clcache.c_sec2
https://bugzilla.redhat.com/attachment.cgi?id=325774&action=diff#ldapserver/ldap/servers/plugins/replication/replutil.c_sec1
https://bugzilla.redhat.com/attachment.cgi?id=325774&action=diff#ldapserver/ldap/servers/slapd/libglobs.c_sec1
https://bugzilla.redhat.com/attachment.cgi?id=325774&action=diff#ldapserver/ldap/servers/slapd/back-ldbm/dbverify.c_sec2
https://bugzilla.redhat.com/attachment.cgi?id=325774&action=diff#ldapserver/ldap/servers/slapd/back-ldbm/ldif2ldbm.c_sec3
This is why it's important to use this compiler checking, and why it's important to fix compiler warnings, if for no other reason than the sheer noise from so many warnings can mask real errors.
Platforms tested: RHEL5
Flag Day: no
Doc impact: no
|
|
|
|
|
|
|
| |
Summary: schema replication op error logs wrong error
Description:
As suggested by Ulf in his original comment, put break in the case
CONN_OPERATION_FAILED and set the macro to return_value for the readability.
|
|
|
|
|
|
|
|
|
|
| |
Summary: Replica crashes in the consumer initialization if the backend to be
replicated does not exist
Description:
. mapping_tree.c: if NULL mapping tree state is passed, return an error.
. repl_extop.c: if mapping tree node state is NULL, don't reset the mapping
tree state.
. replutil.c: if NULL mapping tree state is passed, log it and return.
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Bug Description: Support server-to-server SASL - console chaining, server cleanup
Reviewed by: nkinder (Thanks!)
Fix Description: There are two sets of diffs here. The first set adds tls, gssapi, and digest to the chaining database (aka database link) panels in the console. I had to add support for revert to some of the code to make the Reset button work without having to retrieve the values from the server each time. We already store the original values locally in the _origModel - I added code to allow the use of that in the Reset button.
The second set of diffs is for the server.
1) I had to add support for "SIMPLE" for bindMechanism - this translates to LDAP_SASL_SIMPLE for the actual mechanism. This value is NULL, so I had to add handling for NULL values in the cb config code (slapi_ch_* work fine with NULL values).
2) Added some more debugging/tracing code
3) The server to server SSL code would only work if the server were configured to be an SSL server. But for the server to be an SSL client, it only needs NSS initialized and to have the CA cert. It also needs to configured some of the SSL settings and install the correct policy. I changed the server code to do this.
Platforms tested: RHEL5
Flag Day: no
Doc impact: Yes
|
|
|
|
|
|
|
|
|
|
| |
Summary: memory leaks after db "get" deadlocks, e.g. in CL5 trim
Description: Even if cursor->c_get returns non SUCCESS(==0), there is an
occasion that DBT data holds memory which is allocated in libdb. To release
the memory, put
slapi_ch_free ((void **)&key.data);
slapi_ch_free ((void **)&data.data);
just after the while loop, where we come to the point when cursor->c_get fails.
|
|
|
|
| |
Summary: Clean-up leftover changelog semaphore at startup.
|
|
|
|
| |
Summary: Made replica_set_updatedn detect value add modify operations properly.
|
|
|
|
|
|
|
|
|
|
| |
Bug Description: Support server-to-server SASL - part 4 - pta, winsync
Reviewed by: nhosoi (Thanks!)
Fix Description: Allow pass through auth (PTA) to use starttls. PTA uses the old style argv config params, so I just added an optional starttls (0, 1) to the end of the list, since there is currently no way to encode the startTLS extop in the LDAP URL. NOTE: adding support for true pass through auth for sasl or external cert auth will require a lot of work - not sure it's worth it - anyone other than console users can use chaining backend instead.
For windows sync, I just ported the same slapi_ldap_init/slapi_ldap_bind changes made to regular replication to the windows specific code. The Windows code still needs the do_simple_bind function to check the windows password, but it is not used for server to server bind anymore. NOTE: Windows does support startTLS, but I did not test the SASL mechanisms with Windows.
Platforms tested: Fedora 9
Flag Day: no
Doc impact: yes
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Bug Description: Support server-to-server SASL - part 2
Reviewed by: nhosoi (Thanks!)
Fix Description: This part focuses on chaining backend - allowing the mux server to use SASL to connect to the farm server, and allowing SASL authentication to chain. I had to add two new config parameters for chaining:
nsUseStartTLS - on or off - tell connection to use startTLS - default is off
nsBindMechanism - if absent, will just use simple auth. If present, this must be one of the supported mechanisms (EXTERNAL, GSSAPI, DIGEST-MD5) - default is absent (simple bind)
The chaining code uses a timeout, so I had to add a timeout to slapi_ldap_bind, and correct the replication code to pass in a NULL for the timeout parameter.
Fixed a bug in the starttls code in slapi_ldap_init_ext.
The sasl code uses an internal search to find the entry corresponding to the sasl user id. This search could not be chained due to the way it was coded. So I added a new chainable component called cn=sasl and changed the sasl internal search code to use this component ID. This allows the sasl code to work with a chained backend. In order to use chaining with sasl, this component must be set in the chaining configuration nsActiveChainingComponents. I also discovered that password policy must be configured too, in order for the sasl code to determine if the account is locked out.
I fixed a bug in the sasl mapping debug trace code.
Still to come - sasl mappings to work with all of this new code - kerberos code improvements - changes to pta and dna
Platforms tested: Fedora 8, Fedora 9
Flag Day: yes
Doc impact: yes
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Bug Description: Support server-to-server SASL - part 1
Reviewed by: nkinder, nhosoi, ssorce (Thanks!)
Fix Description: I've created two new functions to handle the client side of LDAP in the server - slapi_ldap_init_ext and slapi_ldap_bind. These two functions are designed to work with any connection type (ldap, ldaps, ldap+starttls, and eventually ldapi) and bind type (plain, sasl, client cert). The secure flag has been extended to use a value of 2 to mean use startTLS. One tricky part is that there is no place to store the startTLS flag in init to pass to bind, so we store that in the clientcontrols field which is currently unused. We do that because the semantics of ldap_init are not to do any network traffic, but defer that until the bind operation (or whatever the first actual operation is e.g. start_tls). I plan to replace all of the places in the code that do ldap init and bind with these functions.
I started with replication. I extended the transport to add tls for startTLS and the bind method to add sasl/gssapi and sasl/digest-md5. I removed a lot of code from repl5_connection that is now done with just slapi_ldap_init_ext and slapi_ldap_bind. One tricky part of the replication code is that it polls the connection for write available, using some ldap sdk internals. I had to fix that code to work within the public ldap api since nspr and sasl muck with the internals in different incompatible ways.
Finally, there is a lot of new kerberos code in the server. The way the server does sasl/gssapi auth with its keytab is similar to the way it does client cert auth with its ssl server cert. One big difference is that the server cannot pass the kerberos identity and credentials through the ldap/sasl/gssapi layers directly. Instead, we have to create a memory credentials cache and set the environment variable to point to it. This allows the sasl/gssapi layer to grab the credentials for use with kerberos. The way the code is written, it should also allow "external" kerberos auth e.g. if someone really wants to do some script which does a periodic kinit to refresh the file based cache, that should also work.
I added some kerberos configure options. configure tries to first use krb5-config to get the compiler and linker information. If that fails, it just looks for some standard system libraries. Note that Solaris does not allow direct use of the kerberos api until Solaris 11, so most likely Solaris builds will have to use --without-kerberos (--with-kerberos is on by default).
Fixed a bug in kerberos.m4 found by nkinder.
ssorce has pointed out a few problems with my kerberos usage that will be addressed in the next patch.
Changed the log level in ldap_sasl_get_val - pointed out by nkinder
Platforms tested: Fedora 9, Fedora 8
Flag Day: yes
Doc impact: oh yes
|
|
|
|
| |
Summary: Add support for 64-bit counters (phase 1).
|
|
|
|
|
| |
Summary: Memory usage research: checking in the experimental code
See also: http://directory.fedoraproject.org/wiki/Memory_Usage_Research
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Reviewed by: nhosoi (Thanks!)
Fix Description: The intptr_t and uintptr_t are types which are defined as integer types that are the same size as the pointer (void *) type. On the platforms we currently support, this is the same as long and unsigned long, respectively (ILP32 and LP64). However, intptr_t and uintptr_t are more portable. These can be used to assign a value passed as a void * to get an integer value, then "cast down" to an int or PRBool, and vice versa. This seems to be a common idiom in other applications where values must be passed as void *.
For the printf/scanf formats, there is a standard header called inttypes.h which defines formats to use for various 64 bit quantities, so that you don't need to figure out if you have to use %lld or %ld for a 64-bit value - you just use PRId64 which is set to the correct value. I also assumed that size_t is defined as the same size as a pointer so I used the PRIuPTR format macro for size_t.
I removed many unused variables and some unused functions.
I put parentheses around assignments in conditional expressions to tell the compiler not to complain about them.
I cleaned up some #defines that were defined more than once.
I commented out some unused goto labels.
Some of our header files shared among several source files define static variables. I made it so that those variables are not defined unless a macro is set in the source file. This avoids a lot of unused variable warnings.
I added some return values to functions that were declared as returning a value but did not return a value. In all of these cases no one was checking the return value anyway.
I put explicit parentheses around cases like this: expr || expr && expr - the && has greater precedence than the ||. The compiler complains because it wants you to make sure you mean expr || (expr && expr), not (expr || expr) && expr.
I cleaned up several places where the compiler was complaining about possible use of uninitialized variables. There are still a lot of these cases remaining.
There are a lot of warnings like this:
lib/ldaputil/certmap.c:1279: warning: dereferencing type-punned pointer will break strict-aliasing rules
These are due to our use of void ** to pass in addresses of addresses of structures. Many of these are calls to slapi_ch_free, but many are not - they are cases where we do not know what the type is going to be and may have to cast and modify the structure or pointer. I started replacing the calls to slapi_ch_free with slapi_ch_free_string, but there are many many more that need to be fixed.
The dblayer code also contains a fix for https://bugzilla.redhat.com/show_bug.cgi?id=463991 - instead of checking for dbenv->foo_handle to see if a db "feature" is enabled, instead check the flags passed to open the dbenv. This works for bdb 4.2 through bdb 4.7 and probably other releases as well.
Platforms tested: RHEL5 x86_64, Fedora 8 i386
Flag Day: no
Doc impact: no
|
|
|
|
|
|
|
|
|
|
|
| |
Bug Description: The Windows Sync API should have plug-in points - part 3
Reviewed by: nkinder (Thanks!)
Fix Description: It turns out I was a little bit too aggressive in removing memory leaks, and broke outbound modify processing. I should not have freed new_dn since it is used elsewhere. There was an earlier memory leak related to the way new_dn was initialized, but that was fixed elsewhere. The real fix is this:
- slapi_sdn_free(&new_dn);
The other fixes are lots of log messages I added to help debug this problem.
Platforms tested: RHEL5
Flag Day: no
Doc impact: yes - plugin guide
|
|
|
|
|
|
|
|
|
|
| |
Bug Description: The Windows Sync API should have plug-in points - part 2
Reviewed by: nkinder (Thanks!)
Fix Description: Some additional changes to the api
The modify callbacks were not sufficient to handle all cases. We need to have access to the DS entry. This changes the API to add the DS entry to the modify callbacks. I also had to change the handling of the userAccountControl - it cannot just overwrite the value, it must set the appropriate bit in the bit mask.
Platforms tested: RHEL5
Flag Day: no
Doc impact: yes - plugin guide
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Bug Description: The Windows Sync API should have plug-in points
Reviewed by: nkinder (Thanks!)
Fix Description: Some additional changes to the api
1) added plugin points for begin update, end update, and agreement destruction
2) added debugging code to allow a regular DS to stand in for AD
3) fixed a couple of minor memory leaks
4) added the rest of the SLAPI DSE code to the public API to allow plugins to do dynamic configuration using the SLAPI public API
Platforms tested: RHEL5
Flag Day: no
Doc impact: yes - plugin guide
|
|
|
|
|
| |
Bug Description: The Windows Sync API should have plug-in points
Fix Description: forgot to add #include "winsync-plugin.h"
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Bug Description: The Windows Sync API should have plug-in points
Reviewed by: nkinder (Thanks!)
Fix Description: Several plug-in points have been added to the windows sync code, available to regular plug-ins that register with the winsync api via the slapi api broker interface. winsync-plugin.h documents the use of these along with some example plug-in code. The windows private data structure has been extended to add two additional fields:
raw_entry - the raw entry read from AD - this is passed to several plug-in callbacks to allow them to have access to all of the attributes and values in the entry in case further processing is needed. This required a change to the function that reads the entry, to have it save the raw entry read each time from AD, in addition to the "cooked" entry it passes back to the caller.
api_cookie - this is the plug-in private data passed back to each plug-in callback and allows the plug-in to specify some additional context
Both of these are stored in the private data field in the agreement, so some of the existing functions had to be changed to pass in the connection object or the protocol object in order to gain access to the agreement object.
There were several small memory leaks in the existing code that have been fixed - these are the places where a free() function of some sort has been added. Also the usage of slapi_sdn_init_dn_byval leaked - slapi_sdn_new_dn_byval must be used here instead - cannot mix slapi_sdn_new with slapi_sdn_init*
I also cleaned up several compiler warnings.
The slapi changes are not strictly necessary, but they provide some conveniences to the winsync code and to plug-in writers. The good thing is that they were already private functions, so mostly just needed to have public api wrappers.
Platforms tested: RHEL5
Flag Day: no
Doc impact: no
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Bug Description: MMR breaks with time skew errors
Reviewed by: nhosoi, nkinder (Thanks!)
Fix Description: CSN remote offset generation seems broken. We seem to accumulate a remote offset that keeps growing until we hit the limit of 1 day, then replication stops. The idea behind the remote offset is that servers may be seconds or minutes off. When replication starts, one of the itmes in the payload of the start extop is the latest CSN from the supplier. The CSN timestamp field is (sampled_time + local offset + remote offset). Sampled time comes from the time thread in the server that updates the time once per second. This allows the consumer, if also a master, to adjust its CSN generation so as not to generate duplicates or CSNs less than those from the supplier. However, the logic in csngen_adjust_time appears to be wrong:
remote_offset = remote_time - gen->state.sampled_time;
That is, remote_offset = (remote sampled_time + remote local offset + remote remote offset) - gen->state.sampled_time
It should be
remote_offset = remote_time - (sampled_time + local offset + remote offset)
Since the sampled time is not the actual current time, it may be off by 1 second. So the new remote_offset will be at least 1 second more than it should be. Since this is the same remote_offset used to generate the CSN to send back to the other master, this offset would keep increasing and increasing over time. The script attached to the bug helps measure this effect. The new code also attempts to refresh the sampled time while adjusting to make sure we have as current a sampled_time as possible. In the old code, the remote_offset is "sent" back and forth between the masters, carried along in the CSN timestamp generation. In the new code, this can happen too, but to a far less extent, and should max out at (real offset + N seconds) where N is the number of masters.
In the old code, you could only call csngen_adjust_time if you first made sure the remote timestamp >= local timestamp. I have removed this restriction and moved that logic into csngen_adjust_time. I also cleaned up the code in the consumer extop - I combined the checking of the CSN from the extop with the max CSN from the supplier RUV - now we only adjust the time once based on the max of all of these CSNs sent by the supplier.
Finally, I cleaned up the error handling in a few places that assumed all errors were time skew errors.
Follow up - I found a bug in my previous patch - _csngen_adjust_local_time must not be called when the sampled time == the current time. So I fixed that where I was calling _csngen_adjust_local_time, and I also changed _csngen_adjust_local_time so that time_diff == 0 is a no-op.
Platforms tested: RHEL5, F8, F9
Flag Day: no
Doc impact: no
QA impact: Should test MMR and use the script to measure the offset effect.
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Bug Description: rhds80 account accountunlocktime attribute breaks replication
Reviewed by: nhosoi (Thanks!)
Fix Description: We were not handling errors returned from the consumer correctly in the async replication code. The problem was that we were exiting the async read results thread immediately. However, we needed to wait for and read all of the outstanding responses, then exit the thread when all of them had been read. The new code handles this case correctly, allowing us to read all of the pending responses before exiting.
The flip side of this is that passwordIsGlobalPolicy only works on the _consumer_. It has no effect whatsoever on the _supplier_ side of replication. The fix for this is to configure fractional replication _always_ and to add the password policy op attrs to the list of attrs not to replicate. This should work fine with RHDS 8.0.0-14 and later.
Platforms tested: RHEL5
Flag Day: no
Doc impact: Yes. We will need to document exactly how passwordIsGlobalPolicy works and how to configure fractional replication.
QA impact: Will need to do more testing of MMR with account lockout to make sure this error does not blow up MMR anymore.
New Tests integrated into TET: Working on it.
|
|
|
|
|
|
|
|
|
| |
Bug Description: "DB_BUFFER_SMALL: User memory too small for return value" error when importing LDIF with replication active
Reviewed by: nkinder (Thanks!)
Fix Description: BDB 4.3 does not use ENOMEM if the given buffer is too small - it uses DB_BUFFER_SMALL. This fix allows us to use DB_BUFFER_SMALL in BDB 4.2 and earlier too. I also cleaned up some of the cl5 api return codes to return an appropriate error code to the higher levels rather than pass the ENOMEM up.
Platforms tested: RHEL5
Flag Day: no
Doc impact: no
|
|
|
|
| |
Summary: Allow fractional replication between masters.
|
|
|
|
| |
Summary: Fixed crash in replication during bulk import. Use bulk impport code more consistently.
|
|
|
|
|
|
|
|
|
|
| |
Bug Description: MMR breaks from master that has been reinited
Reviewed by: nkinder (Thanks!)
Fix Description: This problem occurs when you have two or more masters, and you have updates that have originated at a master that have been sent to other masters (so that the other masters have a valid min/max csn for that replica in the ruv). If that master needs to be reinitialized for some reason (crash, etc.) the reinit will erase the changelog. The RUV for that master will now contain CSNs that are not in the changelog. If that master attempts to update another master, it will first look at the RUV from the consumer, which will contain the old CSNs, and it will look for those CSNs in the changelog, fail, and abort the update process, meaning this master can no longer send updates to other servers.
The solution is for the master to just use the min CSN in its own RUV as the new starting point, if it has not been purged. In the case of purging, if the CSN is not found, this means the consumer is too far behind and must be reinitialized.
Platforms tested: RHEL5 x86_64
Flag Day: no
Doc impact: no
|
|
|
|
| |
Summary: Solaris: warnings reported by the Solaris compiler
|
|
|
|
|
|
|
|
|
| |
Summary: MMR: Supplier does not respond anymore after many operations (deletes)
Description: introduce OP_FLAG_REPL_RUV. It's set in repl5_replica.c if the
entry is RUV. The operation should not be blocked at the backend SERIAL lock
(this is achieved by having OP_FLAG_REPL_FIXUP set in the operation flag).
But updating RUV has nothing to do with VLV, thus if the flag is set, it skips
the VLV indexing.
|
|
|
|
| |
Summary: HP-UX: warnings reported by the HP-UX compiler
|
|
|
|
|
|
|
|
|
|
|
|
| |
Bug Description: Netscape Console allows instance directory to be set as change log
Reviewed by: nkinder (Thanks!)
Fix Description: 1) When removing the changelog files and directories, only remove the actual db related files - version, guardian, *db4, log.*, and __db.* - This should take care of the cases where the changelog was already created in an existing directory.
2) Disallow adding/changing a changelog db directory if it already exists and is not empty
Platforms tested: RHEL5 x86_64
Flag Day: no
Doc impact: no
QA impact: should be covered by regular nightly and manual testing
New Tests integrated into TET: none
|
|
|
|
| |
Summary: Don't add mailGroup objectclass when sync'ing new group entries from AD.
|
|
|
|
| |
Summary: Remove changelog db file when replica config is removed.
|
|
|
|
|
|
| |
Summary: miscellaneous memory leaks
Description: 1) fixed memory leaks
2) cleaned up normalize_path code with fixing memory leaks
|
|
|
|
| |
Summary: Fractional replication log statement needed a newline.
|
|
|
|
| |
Summary: Don't replay AD originated password changes back to AD.
|
|
|
|
| |
Summary: Make sync total update deal with an empty changelog.
|
|
|
|
| |
Summary: Make dbscan handle special RUV related changelog entries.
|