summaryrefslogtreecommitdiffstats
path: root/doc/iprop-notes.txt
blob: 8efee36f6610d33489385ba204f77b5eb2e1c53e (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
Some (intentional) changes from Sun's submission are noted in the
install guide.

Bugs or issues:

The "full resync" part of the protocol involves the master side firing
off a normal kprop (and going back to servicing requests), and the
slave side stopping all the incremental propagation stuff and waiting
for the kprop.  If the connection from the master never comes in for
some reason, the slave side just blocks forever, and never resumes
incremental propagation.

The protocol does not currently pass policy database changes; this was
an intentional decision on Sun's part.  The policy database is only
relevant to the master KDC, and is usually fairly static (aside from
refcount updates), but not propagating it does mean that a slave
maintained via iprop can't simply be promoted to a master in disaster
recovery or other cases without doing a full propagation or restoring
a database from backups.

Shawn had a good suggestion after I started the integration work, and
which I haven't had a chance to implement: Make the update-log code
fit in as a sort of pseudo-database layer via the DAL, being called
through the standard DAL methods, and doing its work around calls
through to the real database back end again through the DAL methods.
So for example, creating a "iprop+db2" database would create an update
log and the real db2 database; storing a principal entry would update
the update log as well; etc.  At least initially, we wouldn't treat it
as a differently-named database; the installation of the hooks would
be done by explicitly checking if iprop is enabled, etc.

The "iprop role" is assumed to be either master or slave.  The master
writes a log, and the slave fetches it.  But what about a cascade
propagation model where A sends to B which sends to C, perhaps because
A's bandwidth is highly limited, or B and C are co-located?  In such a
case, B would want to operate in both modes.  Granted, with iprop the
bandwidth issues should be less important, but there may still be
reasons one may wish to run in such a configuration.

The propagation of changes does not happen in real time.  It's not a
"push" protocol; the slaves poll periodically for changes.  Perhaps a
future revision of the protocol could address that.

kadmin/cli/kadmin.c call to kadm5_init_iprop - is this needed in
client-side program? Should it be done in libkadm5srv instead as part
of the existing kadm5_init* so that database-accessing applications
that don't get updated at the source level will automatically start
changing the update log as needed?

Locking: Currently DAL exports the DB locking interface to the caller;
we want to slip the iprop code in between -- run it plus the DB update
operation with the DB lock held, whether or not the caller grabbed the
lock.  (Does the caller always grab the lock before making changes?)
Currently we're using a file lock on the update log itself; this will
be independent of whether the DB back end implements locking (which
may be a good thing or a bad thing, depending).

Various logging calls with odd format strings like "<null>" should be
fixed.

Why are different principal names used, when incremental propagation
requires that normal kprop (which uses host principals) be possible
anyways?

Why is this tied to kadmind, aside from (a) wanting to prevent other
db changes, which locking protocols should deal with anyways, (b)
existing acl code, (c) existing server process?

The incremental propagation protocol requires an ACL entry on the
master, listing the slave.  Since the full-resync part uses normal
kprop, the slave also has to have an ACL entry for the master.  If
this is missing, I suspect the behavior will be that every two
minutes, the master side will (at the prompting of the slave) dump out
the database and attempt a full propagation.

Possible optimizations: If an existing dump file has a recent enough
serial number, just send it, without dumping again?  Use just one dump
file instead of one per slave?

Requiring normal kprop means the slave still can't be behind a NAT or
firewall without special configuration.  The incremental parts can
work in such a configuration, so long as outgoing TCP connections are
allowed.

Still limited to IPv4 because of limitations in MIT's version of the
RPC code.  (This could be fixed for kprop, if IPv6 sites want to do
full propagation only.  Doing incremental propagation over IPv6 will
take work on the RPC library, and probably introduce
backwards-incompatible ABI changes.)

Overflow checks for ulogentries times block size?

If file can't be made the size indicated by ulogentries, shoud we
truncate or error out?  If we error out, this could blow out when
resizing the log because of a too-large log entry.

The kprop invocation doesn't specify a realm name, so it'll only work
for the default realm.  No clean way to specify a port number, either.
Would it be overkill to come up with a way to configure host+port for
kpropd on the master?  Preferably in a way that'd support cascading
propagations.

The kadmind process, when it needs to run kprop, extracts the slave
host name from the client principal name.  It assumes that the
principal name will be of the form foo/hostname@REALM, and looks
specifically for the "/" and "@" to chop up the string form of the
name.  If looking up that name won't give a working IPv4 address for
the slave, kprop will fail (and kpropd will keep waiting, incremental
updates will stop, etc).

Mapping between file offsets and structure addresses, we should be
careful about alignment.  We're probably okay on current platforms,
but if we break log-format compatibility with Sun at some point, use
the chance to make the kdb_ent_header_t offsets be more strictly
aligned in the file.  (16 or 32 bytes?)

Not thread safe!  The kdb5.c code will get a lock on the update log
file while making changes, but the lock is per-process.  Currently
there are no processes I know of that use multiple threads and change
the database.  (There's the Novell patch to make the KDC
multithreaded, but the kdc-kdb-update option doesn't currently
compile.)

Logging in kpropd is poor to useless.  If there are any problems, run
it in debug mode ("-d").  You'll still lose all output from the
invocation of kdb5_util dump and kprop run out of kadmind.

Other man page updates needed: Anything with new -x options.

Comments from lha:

Verify both client and server are demanding privacy from RPC.

Authorization code in check_iprop_rpcsec_auth is weird.  Check realm
checking, is it trusting the client realm length?

What will happen if my realm is named "A" and I can get a cross realm
(though multihop) to ATHENA.MIT.EDU's iprop server?

Why is the ACL not applied before we get to the functions themselves?