From:	Rich Megginson <rmeggins@redhat.com>
Sent:	Thursday, July 11, 2013 12:29 PM
To:	Unger, Chris
Cc:	Beachler, Michael D
Subject:	Re: 389 Directory Server Bug

On 07/11/2013 10:02 AM, Unger, Chris wrote:
Rich,
...

Yes, decreasing the “idlistscanlimit” helps, but it is still slower than SUN DS when
using that particular filter.  We have approximately 2.2 million entries in our current
directory server implementation, so my question is how low can I set the “idlistscanlimit”
configuration parameter before causing other performance problems (other than by
trial-and-error)?
 
I don't know of a way other than trial and error.
 

SUN recommended that the “allidsthreshold” be set to 10% of the total number of entries
in the directory server.  Is there a similar formula, method, or tool(s) that you
typically use to determine the optimal value for “idlistscanlimit”?
 
No, and that sounds like a very rough rule of thumb from Sun, which will only work either by 
sheer luck or with a lot of additional trial and error. 
 
Please file a ticket against 389. 
 

 
Chris
 
From: Rich Megginson [mailto:rmeggins@redhat.com]  
Sent: Wednesday, July 03, 2013 11:24 AM 
To: Unger, Chris 
Subject: Re: 389 Directory Server Bug
 
On 07/03/2013 08:58 AM, Unger, Chris wrote:
Rich,
OK, we applied those patches, but it didn’t make any difference in regards to our 389 DS
performance problems; however, that patch seems like a worthwhile feature to include
in an eventual formal release of the 389 DS software.
 
Thanks for testing them.  That's good to know. 
 
 
 

 
After some further analysis, we think we have tracked down why 389 DS is slower than
Sun DS in our environment.  The following search filter sent to Sun DS completes with
an average etime of .001 seconds; however, 389 DS takes about .02 seconds to perform
the same search:
 
base="ou=c3sUserProduct,ou=CAS,ou=Commerce,o=cas.org" scope=2
filter="(&(|(objectClass=organizationalPerson)(objectClass=inetOrgPerson)(objectClass=organization)(objectClass=organizationalUnit)(objectClass=groupOfNames)(objectClass=groupOfUniqueNames)(objectClass=group))(c3sUserID=EndUser0000078458))"
attrs="objectClass"
 
Here is the kicker: if I reverse the ANDed (“&”) terms in that search filter so that the
more restrictive term (c3sUserID=EndUser0000078458) is first, then 389 DS returns the
same results in .001 seconds, which is as fast as Sun’s DS.  Somehow Sun’s DS is
automatically optimizing that search regardless of the order of the terms used in the
filter.
 
Yes, I believe SunDS did do some work on filter optimization. 
 
It may also be related to how 389 and SunDS deal with searches that are too large to be 
indexed. 
 
http://port389.org/wiki/Database_Architecture 
 
SunDS has this concept of "allids on write".  What this means is that, when it is updating the 
indexes for a write operation, if the index gets to be a certain size, SunDS will just remove the 
index.  This threshold is called the nsslapd-allidsthreshold.  I'm not sure what the default is, 
probably around 4000 or so. 
 
In your search filter, I'm assuming filters like objectClass=organizationalPerson etc. match 
thousands of entries.  SunDS will see objectClass=organizationalPerson and not even 
attempt to build an ID list from the index - it will just skip it, and skip all of the other 
objectClass=X that have no index due to reaching the allids threshold.  It will then get to 
the last filter component c3sUserID=EndUser0000078458 and use the index for that 
search, which will be extremely fast. 
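
As a rough illustration of the "allids on write" idea, here is a toy Python model.  It is
only a sketch: the names ALLIDS, add_to_index and lookup are made up for this example, the
threshold of 4000 is just the guess mentioned above, and the real on-disk index format is
certainly different.

```python
# Toy model of "allids on write" (hypothetical names, not SunDS code).
ALLIDS = "ALLIDS"  # marker stored in place of an ID list that grew too large


def add_to_index(index, key, entry_id, allids_threshold):
    """Index one value during a write; collapse the list once it passes the threshold."""
    ids = index.setdefault(key, [])
    if ids is ALLIDS:
        return  # already collapsed: nothing more is stored for this key
    ids.append(entry_id)
    if len(ids) > allids_threshold:
        index[key] = ALLIDS  # the ID list is dropped at write time


def lookup(index, key):
    """On read, a collapsed key costs a single lookup and yields no ID list."""
    return index.get(key, [])


index = {}
big_key = ("objectClass", "organizationalPerson")
small_key = ("c3sUserID", "EndUser0000078458")

# 10,000 entries share the object class; only one has this user ID.
for entry_id in range(10_000):
    add_to_index(index, big_key, entry_id, allids_threshold=4000)
add_to_index(index, small_key, 42, allids_threshold=4000)

print(lookup(index, big_key))    # ALLIDS
print(lookup(index, small_key))  # [42]
```

In this model the search pays nothing for the broad objectClass terms, because the work of
deciding "too big to index" was already done when the entries were written.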
 
389 has this concept of "allids on read".  What this means is that 389 will keep indexing 
everything - there is no limit on write operations.  However, when it hits a search filter 
like objectClass=organizationalPerson, it will attempt to build an ID list in memory from the 
index.  There is a parameter nsslapd-idlistscanlimit that controls how many IDs 389 will attempt 
to put into the list.  Once the limit is hit (default 4000), 389 will throw away the list and go on to 
the next filter component.  It has to do this for every filter component, which in your case, 
takes a while.  Then, it gets to the final one c3sUserID=EndUser0000078458 and builds a list 
of 1 ID from the index.  In the 389 case, I'm not sure why moving 
c3sUserID=EndUser0000078458 first helps so much - maybe it sees that it has to & 
(intersect) this with the others, so it doesn't even bother since there is only the 1 ID in 
the list. 
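
To make the cost difference concrete, here is a toy Python model of "allids on read".  It
is a sketch under assumptions, not the 389 code: build_idlist, union and search are
hypothetical names, the index is a plain dict, and the early exit for a one-ID list simply
encodes the guess above about why putting c3sUserID first helps.

```python
# Toy model of "allids on read" (hypothetical names, not 389 DS code).
ALLIDS = None  # sentinel: the term matched too many IDs to use its index


def build_idlist(index, term, scan_limit, stats):
    """Read IDs for one term; throw the list away once scan_limit is exceeded."""
    ids = []
    for entry_id in index.get(term, []):
        stats["ids_read"] += 1
        ids.append(entry_id)
        if len(ids) > scan_limit:
            return ALLIDS  # the reads already done are wasted work
    return ids


def union(lists):
    """OR: if any branch is ALLIDS, the whole OR is ALLIDS."""
    if any(ids is ALLIDS for ids in lists):
        return ALLIDS
    return sorted(set().union(*lists))


def search(index, and_terms, scan_limit):
    """AND the terms in the order given; a list item stands for an OR of terms."""
    stats = {"ids_read": 0}
    result = ALLIDS
    for term in and_terms:
        if result is not ALLIDS and len(result) <= 1:
            break  # assumed shortcut: nothing left to narrow down
        if isinstance(term, list):
            current = union([build_idlist(index, t, scan_limit, stats) for t in term])
        else:
            current = build_idlist(index, term, scan_limit, stats)
        if current is ALLIDS:
            continue  # this component cannot narrow the candidates
        result = current if result is ALLIDS else sorted(set(result) & set(current))
    # Candidates would still be checked against the full filter afterwards.
    return result, stats["ids_read"]


classes = ["organizationalPerson", "inetOrgPerson", "organization",
           "organizationalUnit", "groupOfNames", "groupOfUniqueNames", "group"]
index = {("objectClass", c): list(range(10_000)) for c in classes}
user = ("c3sUserID", "EndUser0000078458")
index[user] = [42]
oc_terms = [("objectClass", c) for c in classes]

slow = search(index, [oc_terms, user], scan_limit=4000)
fast = search(index, [user, oc_terms], scan_limit=4000)
low = search(index, [oc_terms, user], scan_limit=10)
print(slow)  # ([42], 28008)
print(fast)  # ([42], 1)
print(low)   # ([42], 78)
```

With made-up data of 10,000 entries per object class, the original term order reads about
4000 IDs for each of the seven objectClass terms before giving up on them, while either
reordering the terms or lowering the scan limit cuts that wasted work sharply.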
 
So, you might be able to lower nsslapd-idlistscanlimit to a very low value to help with these 
searches.  However, this means that other searches will not use indexes.  For example, suppose 
you did a search like 
 
(location=Sunnyvale) 
 
where you have an equality index on 'location' and there are 1000 matches.  Even if you have 
1m records in the entire database, because this search is indexed, and there are 1000 matches 
(< 4000 nsslapd-idlistscanlimit), it will only build a list of 1000 IDs and only look at these.  If you 
set nsslapd-idlistscanlimit to 10, then the search for location=Sunnyvale would not be able to 
use the ID list, and would have to look through all 1m entries for matches for 
location=Sunnyvale, which would be very slow and resource intensive. 
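
The trade-off in the location=Sunnyvale example can be written out as simple arithmetic.
This is only a sketch: entries_examined is a hypothetical helper, and treating a match
count equal to the limit as still indexed is an assumption.

```python
# Rough cost model for one indexed equality term (hypothetical helper).
def entries_examined(matches, total_entries, scan_limit):
    """Entries the server must look at for a single equality term."""
    if matches <= scan_limit:
        return matches      # ID list fits: only the matching entries are read
    return total_entries    # ID list discarded: every entry is a candidate


TOTAL = 1_000_000  # "1m records" from the example above
MATCHES = 1000     # entries with location=Sunnyvale

print(entries_examined(MATCHES, TOTAL, scan_limit=4000))  # 1000
print(entries_examined(MATCHES, TOTAL, scan_limit=10))    # 1000000
```

Under this model, any limit set below the match count of a commonly used indexed search
turns that search into a scan of the whole database.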

That particular search filter format is heavily used in our environment, and is generated
by vendor code, so we can’t easily change it.  For us, it is the difference in being able
to do 1000 of those types of searches per second with Sun DS, vs. 50 per second with
389 DS.
 
Chris
 
Chris W. Unger                          
UNIX Systems Administrator                   
Chemical Abstracts Service