= Design Overview =

The NIS Server plugin's aim is to serve up data from the directory server using the NIS protocols. It does this by doing what any gateway would do: it queries the directory server for entries which would correspond to the contents of maps, reads the contents of various attributes from those entries, and uses that data to synthesize entries for maps which it serves to clients.

In broad strokes, one design might look like this:

  ┌──────────┐   NIS   ┌───────────┐   LDAP   ┌────────────────────┐
  │  Client  │─────────│  Gateway  │──────────│  Directory Server  │
  └──────────┘         └───────────┘          └────────────────────┘

The links in this diagram represent network traffic. The client uses the NIS protocol to communicate with the gateway, and the gateway uses the LDAP protocol to communicate with the directory server.

This implementation requires that the gateway be robust against variations in directory server availability and flexible enough to use any of a number of methods of authenticating to the directory server. It may additionally require the presence of specific extensions on the server in order to be even reasonably certain of consistency with the directory's contents.

In order to sidestep these requirements, and the complexity they add to an implementation, we decided to implement the gateway as a plugin. As a plugin, the gateway starts and stops with the directory server, it does not need to authenticate as a normal client would, and it can be expected to work with a server which can use it.

Taking just the gateway and directory server portions of the above diagram, and breaking them down further, we can come to this:

  ┌──────────────┐   ┌─────────┐   ┌────────────────────────────┐
  │ NIS Protocol │───│ Mapping │───│ Directory Server Back Ends │
  └──────────────┘   └─────────┘   └────────────────────────────┘

The links in this diagram are now API calls.
We've relegated the work of reading a query (parsed from the NIS client by the NIS Protocol handler), converting that query to a directory server search operation, and marshalling the results of that search into a format suitable for transmission as a NIS response, all to the Mapping module. The directory server back ends are exposed by SLAPI, of course.

This approach does have its problems, though. NIS, as a protocol, requires that the server be able to supply a few bits of information which can't readily (or shouldn't) be retrieved this way, information which is not normally kept in a directory server.

NIS requires that a server be able to report a revision number for a map, which is used as an indicator of the time when the map was last modified. A slave server can use this information to poll for changes in map contents on the master, possibly beginning a full map enumeration to read those new contents in order to serve its clients. A directory server, if it stores revision information at all, stores it on a per-entry basis. So when a gateway designed as we diagrammed above is asked for this information, it has at least these options:

  a) always use the current time
     - This causes frequent map updates on clients when they don't need them, and completely unnecessary network traffic.
  b) always use the same value
     - This keeps clients from ever noticing that a map has changed.
  c) return the latest revision of any of the results which formed the contents of the map
     - This could severely load a directory server if the information needs to be generated by reading the last-modified timestamps from many directory server entries.

NIS also requires that a server be able to answer whether or not it services a specified domain, and which maps it serves for a domain that it serves.
While the mapping module could search the directory's configuration space whenever it is asked these questions, the first question is asked regularly by each running copy of ypbind, which could also bog servers down (though admittedly, less than the previous case).

If we break the mapping portion up further, we can introduce a map cache. In this module we can maintain a cache of the NIS server's data set, taking care to construct it at startup-time, updating it as the contents of the directory server change, and always serving clients using data from the cache.

  ┌──────────────┐  ┌───────────┐  ┌──────────────┐  ┌──────┐
  │ NIS Protocol │──│ Map Cache │──│ Map Back End │──│ Data │
  └──────────────┘  └───────────┘  └──────────────┘  └──────┘

Which takes us to the current design. The NIS protocol handler reads data from the map cache, and the map back end uses SLAPI to obtain data which is used to populate the map cache at startup-time, as well as to watch for changes in the directory's contents which would need to be reflected in the map cache.

= Components =

== Protocol Handler ==

This NIS protocol handler module takes the opportunity to set up listening sockets (listening on the port specified in the plugin configuration entry's "nsslapd-pluginarg0" attribute, or an unused port if none is specified) and register with the local portmapper at module initialization time. The plugin then starts a listening thread to service its clients.

The plugin listens for datagram queries from clients, processing them as they come in, as well as accepting connections from clients. Because connected clients may not always transmit an entire request at once, and because the server may find itself unable to transmit an entire response at once, it buffers traffic for connected clients, multiplexing the work it does for all of its clients from inside of its thread.

The actual protocol datagram parsing is performed by libnsl, which is provided as a part of the C library.
Datagram responses which exceed the "nis-max-dgram-size" threshold (by default, 1024 bytes, or 1 kilobyte) are simply dropped. Response records larger than "nis-max-value-size" (by default, 256 kilobytes) are also ignored, even for connected clients.

Client access is limited by the local tcp_wrappers configuration on the directory server, with a tcp_wrappers service name as dictated by the "nis-tcp-wrappers-name" attribute (by default, "nis-plugin") in the plugin's configuration. If the tcp_wrappers configuration denies access for the client, a connected client's connection will be closed, and a datagram client's request will be discarded.

Client requests are also limited based on a client's address using "securenet"-style settings in the module's configuration entry's "nis-securenet" attribute. If no values are specified, access is allowed to all clients. If the securenet configuration denies access for the client, a connected client's connection will be closed, and a datagram client's request will be discarded.

Client requests are further classed as "secure" or not, based on the query's originating port. This information is used elsewhere for additional access control.

== NIS Layer ==

The NIS layer processes complete requests, whether they come in from connected or datagram clients, fetches the requested information from the map cache, and uses callbacks provided by the protocol handler to respond. Before doing so, if a value is being retrieved from a map, it checks if the map's contents are restricted to "secure" clients. If they are, but the client is not a "secure" client, the NIS layer will respond as if no data were present in the map.

== Map Cache ==

The map cache keeps a dynamically-constructed set of maps in memory, grouped by domain, and for each map maintains information regarding the last time its contents were modified (to answer client requests for a map's order) and whether or not the map's contents should be restricted to "secure" clients.
The map cache can quickly answer whether or not a domain is being served by checking whether or not any maps are defined for it.

The definitions of which maps are served for which domains are configurable via internal APIs -- the map cache itself has no prior knowledge of domain names, map names, or formats, as it merely models data in the way that a conventional NIS server might.

Forcing queries to use the cache provides a couple of benefits over an alternate approach of performing an LDAP query for each NIS query:

  * While the directory server is generally only case-preserving, the NIS server can be case-sensitive, which is preferred by NIS clients and a requirement for some customers.
  * Because the client's query is never used to construct an LDAP filter or query, we don't have to worry about escaping text to avoid string injection attacks.

=== Internal Representation ===

At the topmost level, the map cache is a table. Each entry in the table is the name of a domain and a table of maps.

Each entry in a domain's table of maps contains the map's name, the time the map was last modified, a note indicating whether or not the map is a "secure" map, a linked list of map entries, and a set of indexes into the list. Each map can also hold a data pointer on behalf of the backend.

Each item in the map's list of entries contains an array of NIS keys, an array of corresponding values, a unique identifier (which, currently, stores the NDN of the directory server entry which was used to create this list item) and a data pointer which is kept on behalf of the backend. The map indexes its entry list using an entry's unique identifier, and each of its keys.

== Back End ==

The backend interface module sets up, populates, and maintains the map cache. At startup time, it configures the map cache with the list of domains and maps, and populates the maps with initial data.
Using postoperation plugin hooks, the backend interface also notes when entries are added, modified, renamed (modrdn'd), or deleted from the directory server. It uses this information to create or destroy maps in the map cache, and to add, remove, or update entries in the map cache's maps, thereby ensuring that the map cache always reflects the current contents of the directory server.

The backend interface reads the configuration it should use for the map cache from its configuration area in the directory server. Beneath the plugin's entry, the backend checks for entries with these attributes:

  * nis-domain
  * nis-map
  * nis-secure
  * nis-base
  * nis-filter
  * nis-key-format
  * nis-keys-format
  * nis-value-format
  * nis-values-format

The backend then instructs the map cache to prepare to hold a map in the given domain (or domains) with the given map name (or names), and then performs a subtree search under the specified base (or bases, if there's more than one "nis-base" value) for entries which match the provided filter. Each entry found is then "added" to the map, using the format specifiers stored in the "nis-key-format" and "nis-keys-format" attributes to construct the keys for the entry in the map, with the corresponding value in the map being constructed using the format specifiers stored in the "nis-value-format" and "nis-values-format" attributes. The map is also marked as a "secure" map according to the "nis-secure" attribute, if so set.

For each "nis-key-format" value, exactly one entry will be created in a NIS map. (If a "nis-key-format" does not yield a single value, the directory server entry will not appear in the NIS map.) For each "nis-keys-format" value, any number of entries will be created in a NIS map. The method by which these attributes (and the "nis-value-format" and "nis-values-format" attributes) are interpreted is described below.
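The bookkeeping described above can be pictured with a small Python sketch. It is illustrative only: the names `MapCache`, `Map`, `define_map`, `set_entry`, `remove_entry`, and `lookup` are hypothetical, the domain, map, and entry names in the usage lines are made up, and dictionaries stand in for the linked list and indexes that the design describes.

```python
import time


class Map:
    """One NIS map: entries indexed by unique ID and by NIS key."""

    def __init__(self, secure=False):
        self.secure = secure                  # restricted to "secure" clients?
        self.last_modified = int(time.time()) # used to answer "order" queries
        self.by_id = {}    # entry ID (e.g. the entry's NDN) -> list of keys
        self.by_key = {}   # NIS key -> (owning entry ID, value)

    def set_entry(self, entry_id, keys, values):
        """Add or replace the map entries generated from one directory entry."""
        self.remove_entry(entry_id)
        self.by_id[entry_id] = list(keys)
        for key, value in zip(keys, values):
            self.by_key[key] = (entry_id, value)
        self.last_modified = int(time.time())

    def remove_entry(self, entry_id):
        """Drop everything that was generated from one directory entry."""
        for key in self.by_id.pop(entry_id, []):
            # Only delete keys that this entry still owns.
            if self.by_key.get(key, (None, None))[0] == entry_id:
                del self.by_key[key]
                self.last_modified = int(time.time())


class MapCache:
    """Table of domains, each holding a table of maps."""

    def __init__(self):
        self.domains = {}  # domain name -> {map name -> Map}

    def define_map(self, domain, name, secure=False):
        maps = self.domains.setdefault(domain, {})
        if name not in maps:
            maps[name] = Map(secure)
        return maps[name]

    def serves_domain(self, domain):
        # A domain is served if at least one map is defined for it.
        return bool(self.domains.get(domain))

    def lookup(self, domain, name, key, client_is_secure):
        nis_map = self.domains.get(domain, {}).get(name)
        if nis_map is None or (nis_map.secure and not client_is_secure):
            return None  # answer as if no data were present
        hit = nis_map.by_key.get(key)
        return hit[1] if hit else None


cache = MapCache()
passwd = cache.define_map("example.com", "passwd.byname")
passwd.set_entry("uid=bob,ou=people,dc=example,dc=com",
                 ["bob"], ["bob:*:1000:1000:Bob:/home/bob:/bin/sh"])
print(cache.lookup("example.com", "passwd.byname", "bob", client_is_secure=False))
```

Keying every entry by the identifier of the directory entry that produced it is what makes it cheap to replace or drop the right map entries when a postoperation hook reports a change.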
Should one of the directory server entries which was used to construct one or more NIS map entries be modified or removed, the corresponding entries in every applicable NIS map are updated or removed. Likewise, if an entry is added to the directory server which would correspond to an entry in a NIS map, entries are created in the corresponding NIS maps.

== Formatting Data for NIS ==

The "nis-key-format" and "nis-value-format" specifiers resemble an RPM format specifier, and can include the values of multiple attributes in any part of the specifier. The backend composes the string using the attribute values stored in the directory server entry, using the format specifier as a guide. In this way, the NIS map's contents can be constructed to almost any specification, and can make use of data stored using any schema.

An example specification for the "nis-value-format" for a user's entry could look something like this:

  %{uid}:%{userPassword:-*}:%{uidNumber}:%{gidNumber}:%{gecos:-%{cn:-}}:%{homeDirectory}:%{loginShell:-/bin/sh}

The syntax borrows from RPM's syntax, which in turn borrows from shell syntax, to allow the specification of alternate values to be used when the directory server entry doesn't include a "userPassword" or "gecos" attribute. Additional operators include "#", "##", "%", "%%", "/", "//", which operate in ways similar to their shell counterparts (with one notable exception being that patterns for the "/" operator can not currently be anchored to the beginning or end of the string).

A format specifier can actually be interpreted in two ways: it can be interpreted as a single value, or it can be interpreted as providing a list of values. When the format specifier is being interpreted as a single value, any reference to an attribute value which does not also specify an alternate value will cause the directory server entry to be ignored if the referenced attribute has no value defined for that entry, or contains multiple values.
In the above example, the entry would be ignored if the "uid", "uidNumber", "gidNumber", or "homeDirectory" attributes of the entry did not each contain exactly one value.

The syntax further defines "functions" which can be used to concatenate lists of multiple values into a single result, for example for groups:

  %{cn}:%{userPassword:-*}:%{gidNumber}:%merge(",","%{memberUid}")

This specifier takes advantage of a built-in "merge" function, which processes zero or more single or list values and concatenates them together with a "," separator, to generate the list of group members.

The "nis-filter", "nis-key-format", and "nis-value-format" settings have sensible defaults for the maps which we expect to be commonly used. This is important because it is easy to construct subtly malformed result specifiers which could trigger undefined behavior on clients, for example by leaving the user's numeric UID empty in a passwd entry, which may be treated as "0" by inattentive clients.

A function-like invocation expects a comma-separated list of double-quoted arguments. Any arguments which contain a double-quote need to escape the double-quote using a '\' character. Naturally, the '\' character itself also needs to be escaped whenever it appears.

=== Implemented Functions ===

* first(PATTERN[,DEFAULT])
  - Evaluates the pattern, and if one or more values is available, provides the first value. If no values result, then DEFAULT is evaluated as a pattern and the result is provided.

* match(EXPRESSION,PATTERN[,DEFAULT])
  - Selects the value of EXPRESSION which matches the globbing pattern PATTERN. If zero or two or more values match, and a DEFAULT was specified, the DEFAULT is produced, otherwise an error occurs.
  - Example (examining "cn=group"):
      dn: cn=group
      member: bob
      member: dave

      %match("%{member}","b*") -> bob
      %match("%{member}","d*") -> dave
      %match("%{member}","e*") FAILS
      %match("%{member}","e*","jim") -> jim
      %match("%{member}","*","jim") -> jim (when a single value is required)
      %match("%{member}","*","jim") -> bob,dave (when a list is acceptable)

* regmatch(EXPRESSION,PATTERN[,DEFAULT])
  - Selects the value of EXPRESSION which matches the extended regular expression PATTERN. If zero or two or more values match, and a DEFAULT was specified, the DEFAULT is produced, otherwise an error occurs.
  - Example (examining "cn=group"):
      dn: cn=group
      member: bob
      member: dave

      %regmatch("%{member}","^b.*") -> bob
      %regmatch("%{member}","^d.*") -> dave
      %regmatch("%{member}","e") -> dave
      %regmatch("%{member}","^e") FAILS
      %regmatch("%{member}","^e.*","jim") -> jim
      %regmatch("%{member}",".*","jim") -> jim (when a single value is required)
      %regmatch("%{member}",".*","jim") -> bob,dave (when a list is acceptable)

* regsub(EXPRESSION,PATTERN,TEMPLATE[,DEFAULT])
  - Selects the value of EXPRESSION which matches the extended regular expression PATTERN and uses TEMPLATE to construct the output. If zero or two or more values match, and a DEFAULT was specified, the DEFAULT is produced, otherwise an error occurs. The template is used to construct a result using the n'th substring from the matched value by using the sequence "%n" in the template.
  - Example (examining "cn=group"):
      dn: cn=group
      member: bob
      member: dave

      %regsub("%{member}","o","%0") -> bob
      %regsub("%{member}","o","%1") ->
      %regsub("%{member}","^o","%0") FAILS
      %regsub("%{member}","^d(.).*","%1") -> a
      %regsub("%{member}","^(.*)e","t%1y") -> tdavy
      %regsub("%{member}","^e","%1") FAILS
      %regsub("%{member}","^e.*","%1","jim") -> jim

* deref(THISATTRIBUTE,THATATTRIBUTE)
  - Creates a separated list of the values of THATATTRIBUTE for directory entries named by this entry's THISATTRIBUTE.
    Will fail if used in a context where a single value is required.
  - Example (examining "cn=group"):
      dn: cn=group
      member: uid=bob
      member: uid=pete
      -
      dn: uid=bob
      uid: bob
      -
      dn: uid=pete
      uid: pete

      %deref(",","member","foo") -> FAIL (when a single value is required)
      %deref(",","member","foo") -> (when a list is acceptable)
      %deref(",","member","uid") -> FAIL (when a single value is required)
      %deref(",","member","uid") -> bob,pete (when a list is acceptable)

* referred(MAP,THATATTRIBUTE,THATOTHERATTRIBUTE)
  - Creates a separated list of the values of THATOTHERATTRIBUTE for directory entries which have entries in the current domain in the named MAP and which also have this entry's name as a value for THATATTRIBUTE. Will fail if used in a context where a single value is required.
  - Example (examining "cn=group"):
      dn: cn=group
      -
      dn: uid=bob
      uid: bob
      memberOf: cn=group
      -
      dn: uid=pete
      uid: pete
      memberOf: cn=group

      %referred(",","memberof","foo") -> FAIL (when a single value is required)
      %referred(",","memberof","foo") -> (when a list is acceptable)
      %referred(",","memberof","uid") -> FAIL (when a single value is required)
      %referred(",","memberof","uid") -> bob,pete (when a list is acceptable)

* merge(SEPARATOR,EXPRESSION[,...])
  - Evaluates and then creates a list using multiple expressions which can evaluate to either single values or lists.
  - Example (examining "cn=group"):
      dn: cn=group
      membername: jim
      member: uid=bob
      member: uid=pete
      -
      dn: uid=bob
      uid: bob
      -
      dn: uid=pete
      uid: pete

      %merge(",","%{membername}","%deref(\"member\",\"uid\")") -> jim,bob,pete
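
To make the difference between single-value and list contexts more concrete, here is a minimal Python sketch of how "match" and "merge" might behave on values that have already been read from an entry. It is not the plugin's code: the names `match`, `merge`, `FormatError`, and the `want_list` flag are hypothetical, and the handling of a list context with no matches (fall back to the default, otherwise fail) is an assumption that the examples above do not settle.

```python
import fnmatch


class FormatError(Exception):
    """Signals that an expression could not produce the required result."""


def match(values, pattern, default=None, want_list=False):
    """Select the values which match a glob pattern.

    Single-value context: exactly one match is required; otherwise the
    default is produced if one was given, else the evaluation fails.
    List context: every match is returned (behavior with no matches is
    an assumption of this sketch).
    """
    hits = [v for v in values if fnmatch.fnmatchcase(v, pattern)]
    if want_list and hits:
        return hits
    if not want_list and len(hits) == 1:
        return hits[0]
    if default is not None:
        return [default] if want_list else default
    raise FormatError("pattern %r selected %d values" % (pattern, len(hits)))


def merge(separator, *results):
    """Join single values and lists of values into one separated string."""
    parts = []
    for result in results:
        if isinstance(result, str):
            parts.append(result)   # a single value
        else:
            parts.extend(result)   # a list of values
    return separator.join(parts)


members = ["bob", "dave"]
print(match(members, "b*"))                          # bob
print(match(members, "*", "jim"))                    # jim
print(match(members, "*", "jim", want_list=True))    # ['bob', 'dave']
print(merge(",", "jim", ["bob", "pete"]))            # jim,bob,pete
```

The printed results mirror the "cn=group" examples: an ambiguous match collapses to the default when a single value is required, but yields every match when a list is acceptable.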