= Design Overview =

The NIS Server plugin's aim is to serve up data from the directory
server using the NIS protocols.  It does this by doing what any gateway
would do: it queries the directory server for entries which would
correspond to the contents of maps, reads the contents of various
attributes from those entries, and uses that data to synthesize entries
for maps which it serves to clients.

In broad strokes, one design might look like this:

   ┌──────────┐   NIS   ┌───────────┐   LDAP   ┌────────────────────┐
   │  Client  │─────────│  Gateway  │──────────│  Directory Server  │
   └──────────┘         └───────────┘          └────────────────────┘

The links in this diagram represent network traffic.  The client uses
the NIS protocol to communicate with the gateway, and the gateway uses
the LDAP protocol to communicate with the directory server.

An implementation of this design requires that the gateway be robust
against variations in directory server availability and flexible enough
to use any of a number of methods of authenticating to the directory
server.  It may additionally require the presence of specific extensions
on the server in order to be even reasonably certain of consistency with
the directory's contents.

In order to sidestep these requirements, and the complexity they add to
an implementation, we decided to implement the gateway as a plugin.  As
a plugin, the gateway starts and stops with the directory server, it
does not need to authenticate as a normal client would, and it can be
expected to work with any server which is able to load it.

Taking just the gateway and directory server portions of the above
diagram, and breaking them down further, we can come to this:

   ┌──────────────┐   ┌─────────┐   ┌────────────────────────────┐
   │ NIS Protocol │───│ Mapping │───│ Directory Server Back Ends │
   └──────────────┘   └─────────┘   └────────────────────────────┘

The links in this diagram are now API calls.  We've relegated the work
of reading a query (parsed from the NIS client by the NIS Protocol
handler), converting that query to a directory server search operation,
and marshalling the results of that search into a format suitable for
transmission as a NIS response, all to the Mapping module.  The
directory server back ends are exposed by SLAPI, of course.

This approach does have its problems, though.

NIS, as a protocol, requires that the server be able to supply a few
bits of information which can't readily (or shouldn't) be retrieved this
way, information which is not normally kept in a directory server.

NIS requires that a server be able to report a revision number for a
map, which is used as an indicator of the time when the map was last
modified.  A slave server can use this information to poll for changes
in map contents on the master, possibly beginning a full map enumeration
to read those new contents in order to serve its clients.

A directory server, if it stores revision information at all, stores
it on a per-entry basis.  So when a gateway designed as we diagrammed
above is asked for this information, it has at least these options:
  a) always use the current time
     - This causes frequent map updates on clients when they don't need
       them, and completely unnecessary network traffic.
  b) always use the same value
     - This keeps clients from ever noticing that a map has changed.
  c) return the latest revision of any of the results which formed the
     contents of the map
     - This could severely load a directory server if the information
       needs to be generated by reading the last-modified timestamps
       from many directory server entries.

NIS also requires that a server be able to answer whether or not it
services a specified domain, and which maps it serves for a domain that
it serves.  While the mapping module could search the directory's
configuration space whenever it is asked these questions, the first
question is asked regularly by each running copy of ypbind, which could
also bog servers down (though admittedly, less than the previous case).

If we break the mapping portion up further, we can introduce a map
cache.  In this module we can maintain a cache of the NIS server's data
set, taking care to construct it at startup-time, updating it as the
contents of the directory server change, and always serving clients
using data from the cache.

   ┌──────────────┐  ┌───────────┐  ┌──────────────┐  ┌──────┐
   │ NIS Protocol │──│ Map Cache │──│ Map Back End │──│ Data │
   └──────────────┘  └───────────┘  └──────────────┘  └──────┘

This takes us to the current design.  The NIS protocol handler reads
data from the map cache, and the map back end uses SLAPI to obtain data
which is used to populate the map cache at startup-time, as well as to
watch for changes in the directory's contents which would need to be
reflected in the map cache.

= Components =

== Protocol Handler ==

The NIS protocol handler module sets up listening sockets (on the port
specified in the plugin configuration entry's "nsslapd-pluginarg0"
attribute, or on an unused port if none is specified) and registers with
the local portmapper at module initialization time.  The plugin then
starts a listening thread to service its clients.

The plugin listens for datagram queries from clients, processing them as
they come in, as well as accepting connections from clients.  Because
connected clients may not always transmit an entire request at once, and
because the server may find itself unable to transmit an entire response
at once, it buffers traffic for connected clients, multiplexing the work
it does for all of its clients from inside of its thread.  The actual
protocol datagram parsing is performed by libnsl, which is provided as a
part of the C library.

Datagram responses which exceed the "nis-max-dgram-size" threshold (by
default, 1024 bytes, or 1 kilobyte) are simply dropped.  Response
records larger than "nis-max-value-size" (by default, 256 kilobytes) are
also ignored, even for connected clients.
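
The two limits can be read as a filter applied just before a response is
sent.  The following is a minimal sketch in Python (the plugin itself is
not written in Python); the function and constant names are invented for
illustration, and it assumes that "exceed" means strictly greater than
the configured value and that the limits apply to the encoded response:

```python
# Illustrative sketch only: names are hypothetical, defaults come from the text.
DEFAULT_MAX_DGRAM_SIZE = 1024         # "nis-max-dgram-size" default, in bytes
DEFAULT_MAX_VALUE_SIZE = 256 * 1024   # "nis-max-value-size" default, in bytes


def should_send(response, is_datagram,
                max_dgram=DEFAULT_MAX_DGRAM_SIZE,
                max_value=DEFAULT_MAX_VALUE_SIZE):
    """Return False when the size limits say the response is not sent."""
    if len(response) > max_value:
        return False          # too large for any client, connected or not
    if is_datagram and len(response) > max_dgram:
        return False          # too large for a datagram reply
    return True
```

A connected client can therefore receive a 2000-byte response which a
datagram client asking the same question would never see.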

Client access is limited by the local tcp_wrappers configuration on the
directory server, with a tcp_wrappers service name as dictated by the
"nis-tcp-wrappers-name" attribute (by default, "nis-plugin") in the
plugin's configuration.  If the tcp_wrappers configuration denies access
for the client, a connected client's connection will be closed, and a
datagram client's request will be discarded.

Client requests are also limited based on a client's address using
"securenet"-style settings in the module's configuration entry's
"nis-securenet" attribute.  If no values are specified, access is
allowed to all clients.  If the securenet configuration denies access
for the client, a connected client's connection will be closed, and a
datagram client's request will be discarded.
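
As a rough illustration of the address check, the sketch below assumes
that each "nis-securenet" value is a "NETMASK NETWORK" pair, as in a
traditional securenets file, and that only IPv4 addresses are involved.
Both are assumptions, as are the function names; the text above only
says the settings are "securenet"-style.

```python
import ipaddress


def parse_securenet(value):
    """Parse one 'NETMASK NETWORK' pair (assumed value format)."""
    netmask, network = value.split()
    return (int(ipaddress.IPv4Address(netmask)),
            int(ipaddress.IPv4Address(network)))


def client_allowed(client_address, securenets):
    """Return True if the client may be served."""
    if not securenets:
        return True                       # no values: allow every client
    addr = int(ipaddress.IPv4Address(client_address))
    for value in securenets:
        mask, net = parse_securenet(value)
        if (addr & mask) == (net & mask):
            return True                   # client falls inside this network
    return False
```

A single host can be admitted with an all-ones netmask, for example
"255.255.255.255 127.0.0.1".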

Client requests are further classed as "secure" or not, based on the
query's originating port.  This information is used elsewhere for
additional access control.

== NIS Layer ==

The NIS layer processes complete requests, whether they come in from
connected or datagram clients, fetches the requested information from
the map cache, and uses callbacks provided by the protocol handler to
respond.  Before doing so, if a value is being retrieved from a map, it
checks if the map's contents are restricted to "secure" clients.  If
they are, but the client is not a "secure" client, the NIS layer will
respond as if no data were present in the map.
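
The check can be sketched as follows.  The text does not say which
originating ports count as "secure"; the sketch assumes the conventional
privileged-port boundary (ports below 1024), and the dictionary layout
and function names are invented for the example.

```python
RESERVED_PORT_LIMIT = 1024   # assumption: conventional privileged-port boundary


def is_secure_client(source_port):
    """Classify a request by the port it originated from."""
    return source_port < RESERVED_PORT_LIMIT


def lookup(maps, map_name, key, source_port):
    """maps: {map name: {"secure": bool, "entries": {key: value}}}"""
    nis_map = maps.get(map_name)
    if nis_map is None:
        return None
    if nis_map["secure"] and not is_secure_client(source_port):
        return None          # answer as if no data were present in the map
    return nis_map["entries"].get(key)
```

Note that the non-"secure" client gets the same answer it would get for
a key which does not exist, rather than an explicit refusal.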

== Map Cache ==

The map cache keeps a dynamically-constructed set of maps in memory,
grouped by domain, and for each map maintains information regarding the
last time its contents were modified (to answer client requests for a
map's order) and whether or not the map's contents should be restricted
to "secure" clients.  The map cache can quickly answer whether or not a
domain is being served by checking whether or not any maps are defined
for it.  The definition of which maps are served for which domains is
configurable via internal APIs -- the map cache itself has no advance
knowledge of domain names, map names, or formats, as it merely models
data in the way that a conventional NIS server might.

Forcing queries to use the cache provides a couple of benefits over an
alternate approach of performing an LDAP query for each NIS query:
* While the directory server is generally only case-preserving, the NIS
  server can be case-sensitive, which is preferred by NIS clients and
  a requirement for some customers.
* Because the client's query is never used to construct an LDAP filter
  or query, we don't have to worry about escaping text to avoid string
  injection attacks.

=== Internal Representation ===

At the topmost level, the map cache is a table.  Each entry in the table
is the name of a domain and a table of maps.

Each entry in a domain's table of maps contains the map's name, the time
the map was last modified, a note indicating whether or not the map is a
"secure" map, a linked list of map entries, and a set of indexes into
the list.  Each map can also hold a data pointer on behalf of the
backend.

Each item in the map's list of entries contains an array of NIS keys, an
array of corresponding values, a unique identifier (which, currently,
stores the NDN of the directory server entry which was used to create
this list item) and a data pointer which is kept on behalf of the
backend.

The map indexes its entry list using an entry's unique identifier, and
each of its keys.
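
The layout described above can be modelled roughly as follows.  This is
a sketch in Python rather than the plugin's own data structures: the
class and function names are invented, a Python list stands in for the
linked list, and dictionaries stand in for the indexes.

```python
import time
from dataclasses import dataclass, field


@dataclass(eq=False)
class MapEntry:
    entry_id: str                 # unique identifier (currently the NDN)
    keys: list                    # NIS keys
    values: list                  # values[i] corresponds to keys[i]
    backend_data: object = None   # kept on behalf of the backend


@dataclass
class NisMap:
    name: str
    secure: bool = False
    last_changed: float = field(default_factory=time.time)
    entries: list = field(default_factory=list)   # stands in for the linked list
    by_id: dict = field(default_factory=dict)     # entry_id -> MapEntry
    by_key: dict = field(default_factory=dict)    # key -> (MapEntry, index)
    backend_data: object = None

    def add(self, entry):
        self.remove(entry.entry_id)               # replace any older version
        self.entries.append(entry)
        self.by_id[entry.entry_id] = entry
        for i, key in enumerate(entry.keys):
            self.by_key[key] = (entry, i)
        self.last_changed = time.time()

    def remove(self, entry_id):
        entry = self.by_id.pop(entry_id, None)
        if entry is None:
            return
        self.entries.remove(entry)
        for key in entry.keys:
            if self.by_key.get(key, (None,))[0] is entry:
                del self.by_key[key]
        self.last_changed = time.time()

    def lookup(self, key):
        hit = self.by_key.get(key)        # exact, case-sensitive match
        if hit is None:
            return None
        entry, i = hit
        return entry.values[i]


def serves_domain(cache, domain):
    """cache: {domain name: {map name: NisMap}}"""
    return bool(cache.get(domain))        # served if any map is defined
```

Because "last_changed" is updated on every add or remove, a request for
a map's order can be answered without touching the directory server.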

== Back End ==

The backend interface module sets up, populates, and maintains the map
cache.  At startup time, it configures the map cache with the list of
domains and maps, and populates the maps with initial data.  Using
postoperation plugin hooks, the backend interface also notes when
entries are added, modified, renamed (modrdn'd), or deleted from the
directory server.  It uses this information to create or destroy maps in
the map cache, and to add, remove, or update entries in the map cache's
maps, thereby ensuring that the map cache always reflects the current
contents of the directory server.

The backend interface reads the configuration it should use for the map
cache from its configuration area in the directory server.  Beneath the
plugin's entry, the backend checks for entries with these attributes:
 * nis-domain
 * nis-map
 * nis-secure
 * nis-base
 * nis-filter
 * nis-key-format
 * nis-keys-format
 * nis-value-format
 * nis-values-format
The backend then instructs the map cache to prepare to hold a map in the
given domain (or domains) with the given map name (or names), and then
performs a subtree search under the specified base (or bases, if there's
more than one "nis-base" value) for entries which match the provided
filter.  Each entry found is then "added" to the map, using the format
specifiers stored in the "nis-key-format" and "nis-keys-format"
attributes to construct the keys for the entry in the map, with the
corresponding value in the map being constructed using the format
specifiers stored in the "nis-value-format" and "nis-values-format"
attributes.  The map is also marked as a "secure" map according to the
"nis-secure" attribute, if so set.

For each "nis-key-format" value, exactly one entry will be created in a
NIS map.  (If a "nis-key-format" does not yield a single value, the
directory server entry will not appear in the NIS map.)  For each
"nis-keys-format" value, any number of entries will be created in a NIS
map.  The method by which these attributes (and the "nis-value-format"
and "nis-values-format") are interpreted is described below.

Should one of the directory server entries which was used to construct
one or more NIS map entries be modified or removed, the corresponding
entries in every applicable NIS map are updated or removed.  Likewise,
if an entry is added to the directory server which would correspond to
an entry in a NIS map, entries are created in the corresponding NIS
maps.
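
The bookkeeping done from the postoperation hooks can be sketched like
this.  Everything here is illustrative: the class and function names are
invented, the filter and format specifiers are replaced by plain Python
callables, and DN comparison is reduced to a case-insensitive suffix
test, which is much cruder than real DN normalization.

```python
def under_base(dn, base):
    """Crude subtree test (assumption: lowercase comparison is enough)."""
    dn, base = dn.lower(), base.lower()
    return dn == base or dn.endswith("," + base)


class MapDef:
    """One configured map: where to look, what matches, how to format."""

    def __init__(self, base, matches, make_key, make_value):
        self.base = base
        self.matches = matches          # stands in for "nis-filter"
        self.make_key = make_key        # stands in for "nis-key-format"
        self.make_value = make_value    # stands in for "nis-value-format"
        self.entries = {}               # dn -> (key, value)


def entry_changed(map_defs, dn, attrs):
    """Called after an add or modify; attrs is the entry's new contents."""
    for m in map_defs:
        m.entries.pop(dn, None)         # forget what the old version produced
        if under_base(dn, m.base) and m.matches(attrs):
            m.entries[dn] = (m.make_key(attrs), m.make_value(attrs))


def entry_deleted(map_defs, dn):
    """Called after a delete."""
    for m in map_defs:
        m.entries.pop(dn, None)


def entry_renamed(map_defs, old_dn, new_dn, attrs):
    """Called after a modrdn: treated as a delete followed by an add."""
    entry_deleted(map_defs, old_dn)
    entry_changed(map_defs, new_dn, attrs)
```

Handling a modify as "remove, then re-add if it still matches" also
covers the case where a change makes an entry stop matching the filter.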

== Formatting Data for NIS ==

The "nis-key-format" and "nis-value-format" specifiers resemble an RPM
format specifier, and can include the values of multiple attributes in
any part of the specifier.  The backend composes the string using the
attribute values stored in the directory server entry, using the format
specifier as a guide.  In this way, the NIS map's contents can be
constructed to almost any specification, and can make use of data stored
using any schema.

An example specification for the "nis-value-format" for a user's entry
could look something like this:
  %{uid}:%{userPassword:-*}:%{uidNumber}:%{gidNumber}:%{gecos:-%{cn:-}}:%{homeDirectory}:%{loginShell:-/bin/sh}
The syntax borrows from RPM's syntax, which in turn borrows from shell
syntax, to allow the specification of alternate values to be used when
the directory server entry doesn't include a "userPassword" or "gecos"
attribute.  Additional operators include "#", "##", "%", "%%", "/",
"//", which operate in ways similar to their shell counterparts (with
one notable exception being that patterns for the "/" operator cannot
currently be anchored to the beginning or end of the string).

A format specifier can actually be interpreted in two ways: it can be
interpreted as a single value, or it can be interpreted as providing a
list of values.  When the format specifier is being interpreted as a
single value, any reference to an attribute value which does not also
specify an alternate value will cause the directory server entry to be
ignored if the referenced attribute has no value defined for that entry,
or contains multiple values.  In the above example, the entry would be
ignored if the "uid", "uidNumber", "gidNumber", or "homeDirectory"
attributes of the entry did not each contain exactly one value.
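
To make the single-value rules concrete, here is a small sketch of an
expander which handles only "%{attribute}" and "%{attribute:-default}"
references, including nested defaults.  It is not the plugin's parser:
the names are invented, the other operators and the functions are not
handled, attribute names are matched exactly rather than
case-insensitively, and falling back to the alternate value when an
attribute holds several values is an assumption.

```python
class IgnoreEntry(Exception):
    """Raised when the entry cannot produce a single value."""


def _find_close(fmt, start):
    """Return the index of the '}' matching the '{' at fmt[start]."""
    depth = 0
    for i in range(start, len(fmt)):
        if fmt[i] == "{":
            depth += 1
        elif fmt[i] == "}":
            depth -= 1
            if depth == 0:
                return i
    raise ValueError("unbalanced braces in format specifier")


def _reference(body, entry):
    """Expand the text found between '%{' and its matching '}'."""
    name, has_default, default = body.partition(":-")
    values = entry.get(name, [])
    if len(values) == 1:
        return values[0]
    if has_default:
        return _expand(default, entry)   # the default may contain references
    raise IgnoreEntry(name)              # zero or several values, no default


def _expand(fmt, entry):
    out = []
    i = 0
    while i < len(fmt):
        if fmt.startswith("%{", i):
            close = _find_close(fmt, i + 1)
            out.append(_reference(fmt[i + 2:close], entry))
            i = close + 1
        else:
            out.append(fmt[i])           # literal text is copied through
            i += 1
    return "".join(out)


def expand(fmt, entry):
    """Return the formatted string, or None if the entry is to be ignored."""
    try:
        return _expand(fmt, entry)
    except IgnoreEntry:
        return None
```

Given an entry with single "uid", "uidNumber", "gidNumber", "cn", and
"homeDirectory" values and nothing else, the passwd specifier shown
earlier yields a line whose password field is "*", whose GECOS field is
the "cn" value, and whose shell is "/bin/sh".  Removing "uidNumber" from
the entry makes expand() return None.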

The syntax further defines "functions" which can be used to concatenate
lists of multiple values into a single result, for example for groups:
  %{cn}:%{userPassword:-*}:%{gidNumber}:%merge(",","%{memberUid}")
This specifier takes advantage of a built-in "merge" function, which
processes zero or more single or list values and concatenates them
together with a "," separator, to generate the list of group members.
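
The behaviour of "merge" amounts to flattening and joining.  A minimal
Python sketch, assuming each argument has already been evaluated to
either a single string or a list of strings (the function below is an
illustration, not the plugin's code):

```python
def merge(separator, *results):
    """Join single values and lists of values into one string."""
    items = []
    for result in results:
        if isinstance(result, str):
            items.append(result)      # a single value
        else:
            items.extend(result)      # a list of zero or more values
    return separator.join(items)
```

A group with no "memberUid" values simply gets an empty member list,
which is why the group specifier does not cause such entries to be
ignored.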

The "nis-filter", "nis-key-format", and "nis-value-format" settings have
sensible defaults for the maps which we expect to be commonly used --
this is important because it's easy to subtly construct malformed result
specifiers which could trigger undefined behavior on clients -- for
example by leaving the user's numeric UID empty in a passwd entry, which
may be treated as "0" by inattentive clients.

A function-like invocation expects a comma-separated list of
double-quoted arguments.  Any arguments which contain a double-quote
need to escape the double-quote using a '\' character.  Naturally, the
'\' character itself also needs to be escaped whenever it appears.
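
The quoting rules can be illustrated with a small argument splitter.
This is a sketch under stated assumptions, not the plugin's parser: the
function name is invented, the input is the text found between the
parentheses, and no whitespace is tolerated around the commas.

```python
def parse_args(text):
    """Split '"a","b",...' into a list of unescaped argument strings."""
    args = []
    i, n = 0, len(text)
    while i < n:
        if text[i] != '"':
            raise ValueError("expected '\"' at offset %d" % i)
        i += 1
        current = []
        while True:
            if i >= n:
                raise ValueError("unterminated argument")
            ch = text[i]
            if ch == "\\":
                if i + 1 >= n:
                    raise ValueError("dangling '\\' at end of input")
                current.append(text[i + 1])   # keep the escaped character
                i += 2
            elif ch == '"':
                i += 1                        # closing quote
                break
            else:
                current.append(ch)
                i += 1
        args.append("".join(current))
        if i < n:
            if text[i] != ",":
                raise ValueError("expected ',' at offset %d" % i)
            i += 1
            if i >= n:
                raise ValueError("trailing ','")
    return args
```

Escaping is what allows one function invocation to be passed as an
argument to another: the inner invocation's double-quotes are escaped in
the outer argument and reappear once that argument has been unescaped.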

=== Implemented Functions ===
  * first(PATTERN[,DEFAULT])
    - Evaluates the pattern, and if one or more values is available,
      provides the first value.  If no values result, then DEFAULT is
      evaluated as a pattern and the result is provided.
  * match(EXPRESSION,PATTERN[,DEFAULT])
    - Selects the value of EXPRESSION which matches the globbing pattern
      PATTERN.  If zero or two or more values match, and a DEFAULT was
      specified, the DEFAULT is produced, otherwise an error occurs.
    - Example (examining "cn=group"):
        dn: cn=group
        member: bob
        member: dave
      %match("%{member}","b*")       -> bob
      %match("%{member}","d*")       -> dave
      %match("%{member}","e*")        FAILS
      %match("%{member}","e*","jim") -> jim
      %match("%{member}","*","jim")  -> jim (when a single value is required)
      %match("%{member}","*","jim")  -> bob,dave (when a list is acceptable)
  * regmatch(EXPRESSION,PATTERN[,DEFAULT])
    - Selects the value of EXPRESSION which matches the extended regular
      expression PATTERN.  If zero or two or more values match, and a
      DEFAULT was specified, the DEFAULT is produced, otherwise an error
      occurs.
    - Example (examining "cn=group"):
        dn: cn=group
        member: bob
        member: dave
      %regmatch("%{member}","^b.*")       -> bob
      %regmatch("%{member}","^d.*")       -> dave
      %regmatch("%{member}","e")          -> dave
      %regmatch("%{member}","^e")          FAILS
      %regmatch("%{member}","^e.*","jim") -> jim
      %regmatch("%{member}",".*","jim")   -> jim (when a single value is required)
      %regmatch("%{member}",".*","jim")   -> bob,dave (when a list is acceptable)
  * regsub(EXPRESSION,PATTERN,TEMPLATE[,DEFAULT])
    - Selects the value of EXPRESSION which matches the extended regular
      expression PATTERN and uses TEMPLATE to construct the output.  If
      zero or two or more values match, and a DEFAULT was specified, the
      DEFAULT is produced, otherwise an error occurs.  The template is
      used to construct a result using the n'th substring from the
      matched value by using the sequence "%n" in the template.
    - Example (examining "cn=group"):
        dn: cn=group
        member: bob
        member: dave
      %regsub("%{member}","o","%0")          -> bob
      %regsub("%{member}","o","%1")          -> 
      %regsub("%{member}","^o","%0")         FAILS
      %regsub("%{member}","^d(.).*","%1")    -> a
      %regsub("%{member}","^(.*)e","t%1y")   -> tdavy
      %regsub("%{member}","^e","%1")         FAILS
      %regsub("%{member}","^e.*","%1","jim") -> jim
  * deref(THISATTRIBUTE,THATATTRIBUTE)
    - Creates a separated list of the values of THATATTRIBUTE for
      directory entries named by this entry's THISATTRIBUTE.  Will fail
      if used in a context where a single value is required.
    - Example (examining "cn=group"):
        dn: cn=group
        member: uid=bob
        member: uid=pete
        -
        dn: uid=bob
        uid: bob
        -
        dn: uid=pete
        uid: pete
      %deref(",","member","foo") -> FAIL     (when a single value is required)
      %deref(",","member","foo") ->          (when a list is acceptable)
      %deref(",","member","uid") -> FAIL     (when a single value is required)
      %deref(",","member","uid") -> bob,pete (when a list is acceptable)
  * referred(MAP,THATATTRIBUTE,THATOTHERATTRIBUTE)
    - Creates a separated list of the values of THATOTHERATTRIBUTE for
      directory entries which have entries in the current domain in the
      named MAP and which also have this entry's name as a value for
      THATATTRIBUTE.  Will fail if used in a context where a single
      value is required.
    - Example (examining "cn=group"):
        dn: cn=group
        -
        dn: uid=bob
        uid: bob
        memberOf: cn=group
        -
        dn: uid=pete
        uid: pete
        memberOf: cn=group
      %referred(",","memberof","foo") -> FAIL  (when a single value is required)
      %referred(",","memberof","foo") ->       (when a list is acceptable)
      %referred(",","memberof","uid") -> FAIL  (when a single value is required)
      %referred(",","memberof","uid") -> bob,pete (when a list is acceptable)
  * merge(SEPARATOR,EXPRESSION[,...])
    - Evaluates and then creates a list using multiple expressions which
      can evaluate to either single values or lists.
    - Example (examining "cn=group"):
        dn: cn=group
        membername: jim
        member: uid=bob
        member: uid=pete
        -
        dn: uid=bob
        uid: bob
        -
        dn: uid=pete
        uid: pete
      %merge(",","%{membername}","%deref(\"member\",\"uid\")") -> jim,bob,pete
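
The selection rules shared by "match" and "regmatch" can be modelled as
below.  This is a Python sketch, not the plugin's code, and it rests on
several assumptions: the names are invented, EXPRESSION has already been
evaluated to a list of values, DEFAULT is treated as a literal string,
Python's "re" module stands in for extended regular expressions (the two
agree for the simple patterns used in the examples above), and a list
context with no matches and no DEFAULT is treated as an error.

```python
import fnmatch
import re


class FormatError(Exception):
    """Raised where the examples above say FAILS."""


def _select(values, predicate, default, want_list):
    hits = [v for v in values if predicate(v)]
    if want_list:
        if hits:
            return hits               # any number of matches is acceptable
    elif len(hits) == 1:
        return hits[0]                # exactly one match is required
    if default is not None:
        return [default] if want_list else default
    raise FormatError("no usable match and no DEFAULT")


def match(values, pattern, default=None, want_list=False):
    """Select values using a globbing pattern (case-sensitive)."""
    return _select(values, lambda v: fnmatch.fnmatchcase(v, pattern),
                   default, want_list)


def regmatch(values, pattern, default=None, want_list=False):
    """Select values using a regular expression found anywhere in the value."""
    rx = re.compile(pattern)
    return _select(values, lambda v: rx.search(v) is not None,
                   default, want_list)
```

With the values "bob" and "dave", the pattern "*" matches both, so a
single-value context falls back to the DEFAULT while a list context
keeps both values, which is the distinction drawn in the last two lines
of each example.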