From: Rich Megginson
Sent: Thursday, July 11, 2013 12:29 PM
To: Unger, Chris
Cc: Beachler, Michael D
Subject: Re: 389 Directory Server Bug

On 07/11/2013 10:02 AM, Unger, Chris wrote:

> Rich,
>
> ...
>
> Yes, decreasing the “idlistscanlimit” helps, but it is still slower than SUN DS when using that particular filter. We have approximately 2.2 million entries in our current directory server implementation, so my question is: how low can I set the “idlistscanlimit” configuration parameter before causing other performance problems (other than by trial and error)?

I don't know of any way other than trial and error.

> SUN recommended that the “allidsthreshold” be set to 10% of the total number of entries in the directory server. Is there a similar formula, method, or tool that you typically use to determine the optimal value for “idlistscanlimit”?

No, and that sounds like a very rough rule of thumb from Sun, which will only work either by sheer luck or with a lot of additional trial and error. Please file a ticket against 389.

> Chris

From: Rich Megginson [mailto:rmeggins@redhat.com]
Sent: Wednesday, July 03, 2013 11:24 AM
To: Unger, Chris
Subject: Re: 389 Directory Server Bug

On 07/03/2013 08:58 AM, Unger, Chris wrote:

> Rich,
>
> OK, we applied those patches, but it didn’t make any difference with regard to our 389 DS performance problems; however, that patch seems like a worthwhile feature to include in an eventual formal release of the 389 DS software.

Thanks for testing them. That's good to know.

> After some further analysis, we think we have tracked down why 389 DS is slower than Sun DS in our environment.
> The following search filter sent to Sun DS completes with an average etime of .001 seconds; however, 389 DS takes about .02 seconds to perform the same search:
>
> base="ou=c3sUserProduct,ou=CAS,ou=Commerce,o=cas.org" scope=2 filter="(&(|(objectClass=organizationalPerson)(objectClass=inetOrgPerson)(objectClass=organization)(objectClass=organizationalUnit)(objectClass=groupOfNames)(objectClass=groupOfUniqueNames)(objectClass=group))(c3sUserID=EndUser0000078458))" attrs="objectClass"
>
> Here is the kicker: if I reverse the ANDed (“&”) terms in that search filter so that the more restrictive term (c3sUserID=EndUser0000078458) comes first, then 389 DS returns the same results in .001 seconds, which is as fast as Sun’s DS. Somehow Sun’s DS is automatically optimizing that search regardless of the order of the terms used in the filter.

Yes, I believe SunDS did do some work on filter optimization. It may also be related to how 389 and SunDS deal with searches that are too large to be indexed.

http://port389.org/wiki/Database_Architecture

SunDS has this concept of "allids on write". What this means is that, when it is updating the indexes for a write operation, if the index gets to be a certain size, SunDS will just remove the index. This threshold is called the nsslapd-allidsthreshold. I'm not sure what the default is, probably around 4000 or so. In your search filter, I'm assuming filters like objectClass=organizationalPerson etc. match thousands of entries. SunDS will see objectClass=organizationalPerson and not even attempt to build an ID list from the index - it will just skip it, and skip all of the other objectClass=X components that have no index due to reaching the allids threshold. It will then get to the last filter component c3sUserID=EndUser0000078458 and use the index for that search, which will be extremely fast.

389 has this concept of "allids on read". What this means is that 389 will keep indexing everything - there is no limit on write operations.
However, when it hits a search filter like objectClass=organizationalPerson, it will attempt to build an ID list in memory from the index. There is a parameter nsslapd-idlistscanlimit that controls how many IDs 389 will attempt to put into the list. Once the limit is hit (default 4000), 389 will throw away the list and go on to the next filter component. It has to do this for every filter component, which, in your case, takes a while. Then, it gets to the final one, c3sUserID=EndUser0000078458, and builds a list of 1 ID from the index.

In the 389 case, I'm not sure why moving c3sUserID=EndUser0000078458 first helps so much - maybe it sees that it has to & (intersect) this with the others, so it doesn't even bother since there is only the 1 ID in the list.

So, you might be able to lower nsslapd-idlistscanlimit to a very low value to help with these searches. However, this means that other searches will not use indexes. For example, suppose you did a search like (location=Sunnyvale) where you have an equality index on 'location' and there are 1000 matches. Even if you have 1 million records in the entire database, because this search is indexed, and there are 1000 matches (fewer than the nsslapd-idlistscanlimit of 4000), it will only build a list of 1000 IDs and only look at these. If you set nsslapd-idlistscanlimit to 10, then the search for location=Sunnyvale would not be able to use the ID list, and would have to look through all 1 million entries for matches for location=Sunnyvale, which would be very slow and resource intensive.

> That particular search filter format is heavily used in our environment, and is generated by vendor code, so we can’t easily change it. For us, it is the difference between being able to do 1000 of those types of searches per second with Sun DS and 50 per second with 389 DS.
>
> Chris
>
> Chris W. Unger
> UNIX Systems Administrator
> Chemical Abstracts Service
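
The trade-off described above for nsslapd-idlistscanlimit can be illustrated with a toy model. The Python sketch below is not 389 DS code. The function name `evaluate_and`, the in-memory `INDEX` dictionary, the 100,000-entry directory size, and the assumption that reading an index stops exactly at the scan limit are all hypothetical simplifications. The OR of seven objectClass terms is also reduced to a single objectClass component. It only shows how a lower limit reduces the index IDs read for the objectClass-style filter while forcing (location=Sunnyvale) to examine every entry.

```python
ALLIDS = None  # sentinel: no usable ID list, so every entry is a candidate

TOTAL = 100_000  # hypothetical directory size for this sketch

# Hypothetical equality indexes: (attribute, value) -> matching entry IDs.
INDEX = {
    ("objectClass", "inetOrgPerson"): range(TOTAL),   # matches everything
    ("location", "Sunnyvale"): range(1000),           # 1000 matches
    ("c3sUserID", "EndUser0000078458"): [78458],      # 1 match
}

PERSON_FILTER = [("objectClass", "inetOrgPerson"),
                 ("c3sUserID", "EndUser0000078458")]
LOCATION_FILTER = [("location", "Sunnyvale")]


def evaluate_and(index, keys, scan_limit, total_entries):
    """Return (entries_to_examine, ids_read) for an AND of equality terms."""
    result = ALLIDS
    ids_read = 0
    for key in keys:
        ids = index.get(key, [])
        # Assumption: the server reads IDs until it reaches the scan limit.
        ids_read += min(len(ids), scan_limit)
        if len(ids) > scan_limit:
            # List is thrown away; this term cannot narrow the candidates.
            continue
        result = set(ids) if result is ALLIDS else result & set(ids)
    to_examine = total_entries if result is ALLIDS else len(result)
    return to_examine, ids_read


if __name__ == "__main__":
    for limit in (4000, 10):
        print(limit,
              evaluate_and(INDEX, PERSON_FILTER, limit, TOTAL),
              evaluate_and(INDEX, LOCATION_FILTER, limit, TOTAL))
```

In this model, lowering the limit from 4000 to 10 cuts the IDs read for the objectClass-plus-c3sUserID filter from 4001 to 11, but the location search goes from examining 1000 candidate entries to examining all 100,000.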