=== Design Overview ===

The NIS plugin module's aim is to serve up data from the directory server using the NIS protocols.

It does this by doing what any gateway would do: it queries the directory server for entries which would correspond to the contents of maps, reads the contents of various attributes from those entries, and uses that data to synthesize entries for maps which it serves to clients.

In broad strokes, it might look like this:

  ┌──────────┐   NIS   ┌───────────┐   LDAP   ┌────────────────────┐
  │  Client  │─────────│  Gateway  │──────────│  Directory Server  │
  └──────────┘         └───────────┘          └────────────────────┘

The links in this diagram represent network traffic. The client uses the NIS protocol to communicate with the gateway, and the gateway uses the LDAP protocol to communicate with the directory server.

An implementation along these lines requires that the gateway be robust against variations in directory server availability and be flexible enough to use any of a number of methods of authenticating to the directory server. It may additionally require the presence of specific extensions on the server in order to be even reasonably certain of consistency with the directory's contents.

In order to sidestep these requirements, and the complexity they add to an implementation, we decided to implement the gateway as a plugin. As a plugin, the gateway starts and stops with the directory server, it does not need to authenticate as a normal client would, and it can be expected to work with any server which is able to load it.

Taking just the gateway and directory server portions of the above diagram, and breaking them down further, we can come to this:

  ┌──────────────┐   ┌─────────┐   ┌────────────────────────────┐
  │ NIS Protocol │───│ Mapping │───│ Directory Server Back Ends │
  └──────────────┘   └─────────┘   └────────────────────────────┘

The links in this diagram are all API calls.
We've relegated the work of reading a query (parsed from the NIS client by the NIS Protocol handler), converting that query to a directory server search operation, and marshalling the results of that search into a format suitable for transmission as a NIS response, all to the Mapping module. The directory server back ends are exposed by SLAPI, of course.

This approach does have its problems, though. NIS, as a protocol, requires that the server be able to supply a few bits of information which can't readily (or shouldn't) be retrieved this way.

NIS requires that a server be able to report a revision number for a map, akin to the serial number used in a DNS SOA record. A slave server can use this information to poll for changes in map contents on the master, possibly beginning a full map enumeration to read those new contents in order to serve its clients. A directory server, if it stores revision information at all, stores it on a per-entry basis. So when a gateway designed as we diagrammed above is asked for this information, it has at least these options:

  a) use an ever-increasing value, such as the current time
     - This causes frequent map updates on clients when they don't need them, and completely unnecessary network traffic.
  b) always use the same value
     - This keeps clients from ever noticing that a map has changed.
  c) return the latest revision of any of the results which formed the contents of the map
     - This could severely load a directory server if the information needs to be generated dynamically and frequently.

NIS also requires that a server be able to answer whether or not it services a specified domain, and which maps it serves for a domain that it serves. While the mapping module could search the directory's configuration space whenever it is asked these questions, the first question is asked repeatedly by each running copy of ypbind, which could also bog servers down (though admittedly, less than the previous case).

If we break the mapping portion up further, we can introduce a map cache. In this module we can maintain a cache of the NIS server's data set, taking care to construct it at startup time, updating it as the contents of the directory server change, and always serving clients using data from the cache.

  ┌──────────────┐  ┌───────────┐  ┌──────────────┐  ┌──────┐
  │ NIS Protocol │──│ Map Cache │──│ Map Back End │──│ Data │
  └──────────────┘  └───────────┘  └──────────────┘  └──────┘

This takes us to the current design. The NIS protocol handler reads data from the map cache, and the map back end uses SLAPI to populate the map cache at startup time, as well as to watch for changes in the directory's contents which would need to be reflected in the map cache.

=== Components ===

== Protocol Handler ==

The NIS protocol handler module takes the opportunity to set up listening sockets and register with the local portmapper at module initialization time. (It does so at this point because the directory server has not yet dropped privileges, and the portmapper will not accept registrations from unprivileged clients.) The plugin then starts a listening thread to handle its clients.

[The plugin listens for datagram queries from clients, processing them as they come in, as well as answering connections from clients. Because connected clients may not always transmit an entire request at once, and because the server may find itself unable to transmit an entire response at once, it buffers traffic for connected clients, multiplexing the work it does for all of its clients from inside of the thread.]

The actual protocol datagram parsing is performed by libnsl, which is provided as a part of the C library.

[Unless explicitly disabled in the module's configuration or in a map's configuration, the local /etc/securenets file is consulted to restrict access to map information to specific clients. The list of securenet entries can also be stored in the module or map.]
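
The access check described above can be illustrated with a small Python sketch. It is not taken from the plugin: the function names are hypothetical, and it assumes the conventional securenets layout in which each line holds either a netmask followed by a network address, or the word "host" followed by a single address. The plugin's actual parsing rules may differ.

```python
import ipaddress


def parse_securenets(text):
    """Parse securenets-style text into a list of networks.

    Assumed layout (not confirmed by this document): one rule per line,
    either "<netmask> <network>" or "host <address>"; "#" starts a comment.
    """
    nets = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        first, second = line.split()
        if first == "host":
            # A single address becomes a one-address network.
            nets.append(ipaddress.ip_network(second))
        else:
            # strict=False tolerates host bits set in the network address.
            nets.append(ipaddress.ip_network(f"{second}/{first}", strict=False))
    return nets


def client_allowed(nets, address):
    """Return True if the client address falls inside any listed network."""
    addr = ipaddress.ip_address(address)
    return any(addr in net for net in nets)


rules = parse_securenets("host 127.0.0.1\n255.255.255.0 192.168.1.0\n")
print(client_allowed(rules, "192.168.1.42"))
```

A real implementation would also need to decide what to do when no rules are configured at all, which this sketch leaves open.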

== Map Cache ==

The map cache keeps a dynamically-constructed set of maps in memory, grouped by domain, and for each map maintains information regarding the last time its contents were modified (to answer client requests for a map's order). The map cache can quickly answer whether or not a domain is being served by checking whether or not any maps are defined for it.

The definitions of which maps are served for which domains are configurable via internal APIs -- the map cache itself has no prior knowledge of domain names, map names, or formats, as it merely models data in the way that a NIS server might.

[While currently the cache is implemented in a prototyping-friendly list of lists structure, I anticipate that searching through maps of larger than trivial size will be expensive enough that a better internal representation will need to be used. The main requirement from the NIS protocol-handler side is the ability to find a given key and/or the successor key for a given key, and their matching data items. The backend requires that the cache also be able to track one or more DNs which are relevant to the value which is being stored for a given key in the map, so that it can be updated if a directory entry with that DN is added, removed, modified, or renamed.]

Forcing queries to use the cache provides a couple of benefits over the alternative approach of performing an LDAP query for each NIS query:

* While the directory server is generally only case-preserving, the NIS server can be case-sensitive, which is preferred by NIS clients and a requirement for some customers.
* Because the client's query is never used to construct an LDAP filter or query, we don't have to worry about escaping text to avoid string injection attacks.

== Back End ==

The backend interface module sets up, populates, and maintains the map cache. At startup time, it configures the map cache with the list of domains and maps, and populates the maps with initial data.
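
To make the cache's responsibilities concrete, here is a minimal Python sketch of the interface the backend drives and the protocol handler reads from. It is an illustration only, not the plugin's data structure: the class and method names are hypothetical, keys are kept in sorted order purely so that a successor can be found with a binary search, and each key is assumed to come from exactly one DN (the text above allows for more than one).

```python
import bisect
import time


class MapCache:
    """Toy in-memory model of the map cache; all names are hypothetical."""

    def __init__(self):
        # domain name -> map name -> per-map state
        self._domains = {}

    def define_map(self, domain, name):
        """Prepare to hold a map; the cache knows nothing about it otherwise."""
        maps = self._domains.setdefault(domain, {})
        maps.setdefault(name, {"keys": [], "data": {}, "by_dn": {},
                               "order": int(time.time())})

    def serves_domain(self, domain):
        """A domain is served if at least one map is defined for it."""
        return bool(self._domains.get(domain))

    def set_entry(self, domain, name, dn, key, value):
        """Add or replace the map entry generated from the entry at `dn`."""
        m = self._domains[domain][name]
        self.remove_dn(domain, name, dn)  # handles renames and key changes
        if key not in m["data"]:
            bisect.insort(m["keys"], key)
        m["data"][key] = value
        m["by_dn"][dn] = key
        m["order"] = int(time.time())

    def remove_dn(self, domain, name, dn):
        """Drop whatever map entry was generated from the entry at `dn`."""
        m = self._domains[domain][name]
        key = m["by_dn"].pop(dn, None)
        if key is not None and key in m["data"]:
            del m["data"][key]
            m["keys"].remove(key)
            m["order"] = int(time.time())

    def match(self, domain, name, key):
        """Exact, case-sensitive lookup; the key never reaches an LDAP filter."""
        return self._domains[domain][name]["data"].get(key)

    def first(self, domain, name):
        m = self._domains[domain][name]
        return (m["keys"][0], m["data"][m["keys"][0]]) if m["keys"] else None

    def next_after(self, domain, name, key):
        """Return the (key, value) pair following `key`, or None at the end."""
        m = self._domains[domain][name]
        i = bisect.bisect_right(m["keys"], key)
        return (m["keys"][i], m["data"][m["keys"][i]]) if i < len(m["keys"]) else None

    def order(self, domain, name):
        """Last-modified time of the map, used as its revision number."""
        return self._domains[domain][name]["order"]
```

Indexing entries by DN is what lets a change notification for a single directory entry be turned into a single map update without rescanning the map.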

Using postoperation plugin hooks, the backend interface also notes when entries in the directory server are added, modified, renamed (modrdn'd), or deleted. It uses this information to [create or destroy maps in the map cache, and to] add, remove, or update entries in the map cache's maps, thereby ensuring that the map cache always reflects the current contents of the directory server.

The backend interface reads the configuration it should use for the map cache from its configuration area in the directory server. Beneath the plugin's entry, the backend checks for entries with these attributes:

* domain
* map
* base
* filter
* key[Format]
* value[Format]

The backend then instructs the map cache to prepare to hold a map in the given domain with the given map name, and then performs a subtree search under the specified base for entries which match the provided filter. Each found entry is then "added" to the map, using the value of the attribute named by the "key[Format]" as the key for the entry in the map, with the corresponding value in the map being the value of the attribute named by the "value" [being constructed using the format specifier given as the "valueFormat"].

The "valueFormat" specifier resembles an RPM format specifier, and can include the values of multiple attributes in any part of the specifier. The backend composes the string using the attribute values stored in the directory server entry, using the format specifier as a guide. In this way, the NIS map's contents can be constructed to almost any specification, and can make use of data stored using any schema.

An example specification for a user's entry would look like this:

  %{uid}:%{userPassword:-*}:%{uidNumber}:%{gidNumber}:%{gecos:-%{cn:-}}:%{homeDirectory}:%{loginShell:-/bin/sh}

The syntax borrows from RPM's syntax, which in turn borrows from shell syntax, to allow the specification of alternate values to be used when the directory server entry doesn't include a "userPassword" or "gecos" attribute.

To ensure safety, any reference to an attribute value which does not also specify an alternate value will cause the directory server entry to be ignored if the referenced attribute has no value defined for that entry, or contains multiple values. In the above example, the entry would be ignored if the "uid", "uidNumber", "gidNumber", or "homeDirectory" attributes of the entry did not each contain exactly one value.

The syntax further defines "functions" which can be used to concatenate lists of multiple values into a single result, for example for groups:

  %{cn}:%{userPassword:-*}:%{gidNumber}:%list{",","memberUid"}

This specifier takes advantage of a built-in "list" function, which processes zero or more values of the "memberUid" attribute and concatenates them together with a "," separator, to generate the list of group members.

[The filter, key, and value have sensible defaults for the maps which we expect to be using -- this is important because it's easy to subtly construct malformed result strings which could trigger undefined behavior on clients -- for example by leaving the user's numeric UID empty in a passwd entry, which may be treated as "0" by inattentive clients.]

[The format specifier includes function-like invocations to allow the backend to be instructed to chase references to other entries, for example to handle flattening of nested groups or netgroups.]
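
The expansion rules for the user and group examples above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the plugin's parser: the names `expand` and `SkipEntry` are hypothetical, an entry is modelled as a dictionary from attribute name to a list of string values, and a reference that carries an alternate value is assumed to fall back to it whenever the attribute does not hold exactly one value (the text only specifies the missing-attribute case).

```python
import re


class SkipEntry(Exception):
    """Raised when a required attribute does not have exactly one value."""


def _find_close(spec, start):
    """Return the index of the '}' matching the '{' at spec[start]."""
    depth = 0
    for i in range(start, len(spec)):
        if spec[i] == "{":
            depth += 1
        elif spec[i] == "}":
            depth -= 1
            if depth == 0:
                return i
    raise ValueError("unbalanced braces in format specifier")


def expand(spec, entry):
    """Expand %{attr}, %{attr:-alternate} and %list{"sep","attr"} references."""
    out = []
    i = 0
    while i < len(spec):
        if spec.startswith("%{", i):
            close = _find_close(spec, i + 1)
            name, has_alt, alternate = spec[i + 2:close].partition(":-")
            values = entry.get(name, [])
            if len(values) == 1:
                out.append(values[0])
            elif has_alt:
                # The alternate may itself contain references, e.g. %{cn:-}.
                out.append(expand(alternate, entry))
            else:
                raise SkipEntry(name)
            i = close + 1
        elif spec.startswith("%list{", i):
            close = _find_close(spec, i + 5)
            separator, name = re.findall(r'"([^"]*)"', spec[i + 6:close])
            # Zero or more values are joined; an absent attribute yields "".
            out.append(separator.join(entry.get(name, [])))
            i = close + 1
        else:
            out.append(spec[i])
            i += 1
    return "".join(out)


group = {"cn": ["staff"], "gidNumber": ["50"], "memberUid": ["alice", "bob"]}
print(expand('%{cn}:%{userPassword:-*}:%{gidNumber}:%list{",","memberUid"}', group))
```

A caller would catch `SkipEntry` and leave the entry out of the map, which matches the rule that an entry lacking a required single-valued attribute is ignored rather than emitted with an empty field.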