1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
|
DESCRIPTION
===========
The CGroup Rules Engine Daemon is a tool that will automatically place tasks
into the correct cgroup based on UID/GID events from the kernel. It will not
automatically classify tasks that are already running, but it will classify
any new tasks, and any tasks which change their UID/GID. Note that we use
the euid and egid, not the ruid and rgid.
Unlike other tools, cgrulesengd caches the rules configuration in a data
structure (it's actually just a FIFO linked list) so that it doesn't need to
parse the configuration file more than once. This should be much faster than
parsing the rules for each UID/GID event. Eventually, this caching logic
should be part of libcgroup, so that any other program can take advantage of it
(and so that all programs are using the same table). The configuration can
be reloaded without stopping the daemon (more information below).
WHY A DAEMON?
=============
A daemon is easy to use, and allows an administrator to ensure that all tasks
are classified into the correct cgroups without constantly monitoring the
system. The daemon is transparent to the users, and does not require any
modifications to existing userspace programs. Finally, the daemon can be
started and stopped at any time, including at boot time with other services.
Thus, sytem administrators can decide not to use the daemon if they choose.
Most importantly, some programs create new users and/or run scripts,
threads, etc. as those users using suexec(). This call does not go through
PAM, so these scripts would continue running in the same cgroup as the parent
program. This behavior is likely not ideal, and the daemon would solve this
problem.
Apache does this. Apache creates a user called 'apache' and uses setuid() to
launch tasks as that user. This does not go through PAM, so without a daemon,
these tasks would continue to run in the 'root' cgroup rather than in the
'apache' or 'webserver' cgroup. The daemon fixes this problem by catching the
setuid() call and moving the tasks into the correct cgroup.
We would ask Apache to modify their software to interface with libcgroup, but
this solution is less than optimal because a lot of userspace software would
have to be changed, and some authors might intentionally not interact with
libcgroup, which could create an exploit. The daemon is a simple, transparent
solution.
USING THE DAEMON
================
The daemon can be used as a service with the cgred script, which is shipped
as scripts/init.d/cgred. This script should be installed as /etc/init.d/cgred
and used like any other service. To start the daemon,
/etc/init.d/cgred start
To stop it,
/etc/init.d/cgred stop
The restart (stop,start), condrestart (same as restart, but only if the daemon
was already started), and status (print whether the daemon is started or
stopped) commands are also supported. An additional command, "reload", allows
you to reload the configuration file without stopping the daemon.
/etc/init.d/cgred reload
The cgred script automatically loads configuration from /etc/sysconfig/cgred.conf,
which is shipped as samples/cgred.conf. See that file for more information.
If you choose not to run the daemon as a service, the following options are
currently supported:
--nodaemon Do not run as a daemon
--nolog Write log output to stdout instead of a log file
--config [FILE] Read rules configuration from FILE instead of
/etc/cgrules.conf
You can ask the daemon to reload the configuration by sending it SIGUSR2. The
easiest way to do this is with the 'kill' command:
kill -s SIGUSR2 [PID]
TESTING
=======
The program setuid (found in tests/setuid.c) can help you test the daemon. By
default, this program attempts to change its UID to root and then idles until
you kill it. You can change the default behavior to use a different UID, or
you can uncomment the second block of code to instead attempt to change the
GID.
In order to make sure that everything works, I used the following rules:
sjo cpu default
cgtest cpu cgtest
% memory default
@cgroup cpu,memory cgtest
peter cpu test1
% memory test2
@root * default
* * default
The users 'sjo' and 'cgtest' were normal users. 'peter' is not a user on the
system. The group 'cgroup' is a group containing sjo,root,cgtest as members,
and the group 'root' contains only root. The cgroups 'default' and 'cgtest'
exist, while 'test1' and 'test2' do not. Currently, the daemon does not check
for the existance of 'test1', though this would be easier to do once the
parsing and caching logic is moved into libcgroup.
I ran the following tests, all of which were successful:
- set UID to sjo (should move cpu controller into default)
- set UID to root (should move cpu,memory controllers into cgtest)
- set UID to cgtest (should move cpu controller into cgtest, memory
controller into default)
- set GID to root (should move all controllers into default)
- set GID to cgroup (should move cpu, memory into cgtest)
- set GID to users (should move all controllers into default)
The parsing logic will skip the 'peter' rule as well as its multi-line
components (in this case "% memory test2"), because the user does not exist.
This should work for group rules, too. Attempting to setuid() or setgid() to a
user/group that doesn't exist will just return an error and not generate a
kernel event of the PROC_EVENT_UID or PROC_EVENT_GID type, so the daemon won't
do anything for it.
CONCERNS/ISSUES
===============
- Netlink can be unreliable, and the daemon might miss an event if the buffer
is full. One possible solution is to have one or two files that the kernel
can queue UID/GID changes in, and have the daemon read those files whenever
they are updated. From testing, this does not actually appear to be a real
problem, but it could become one with faster machines.
- The daemon does not care for namespaces at all, which can cause conflicts
with containers. If a user places his tasks into exec-based cgroups (such
as 'network' and 'development'), the daemon will not realize this and will
simply place them into the user's cgroup (so, sjo/ instead of sjo/network/).
CHANGELOG
=========
V9:
- Updated documentation, because it was very old and incorrect.
- Reverted the changes to cgexec and cgclassify.
- New API function: cgroup_change_cgroup_uid_gid_flags().
- Deprecated cgroup_change_cgroup_uid_gid().
- Rewrote some of the rule matching and execution logic in api.c to be
faster, and easier to read.
- Changes all negative return values to positive values. As a side effect,
cgroup_parse_rules() now returns -1 when we get a match and we are using
non-cached rules.
- Changes CGROUP_FUSECACHE to CGFLAG_USECACHE.
- Flags are now enumerated (cgflags), instead of #defines.
V8:
- Moved the event-handling logic back into the daemon, where it should be.
- Changed cgroup_parse_rules() to work with cached rules or non-cached rules.
The other parsing function is no longer needed, and should be deprecated.
- Non-cached rules now work with the same structs as cached rules.
- Modified cgroup_change_cgroup_uid_gid() with a new 'flags' parameter.
Currently, the only flag is "CGROUP_FUSECACHE" to use the cached rules logic
(or not).
- Added cgroup_rules_loaded() boolean, to check whether the cached rules have
been loaded yet, and cgroup_init_rules_cache() to load them.
- Modified cgexec and cgclassify to work with the new
cgroup_change_cgroup_uid_gid().
V7:
- Moved parsing and caching logic into libcgroup.
- Added locking mechanisms around the list of rules.
- Cleaned up #includes in cgrulesegnd.[h,c].
- Added notification if netlink receive queue overflows.
- Added logic to catch SIGINT in addition to SIGTERM.
- New API functions:
- cgroup_free_rule(struct cgroup_rule*)
- cgroup_free_rule_list(struct cgroup_rule_list*)
- cgroup_parse_rules(void)
- cgroup_print_rules_config(FILE*)
- cgroup_reload_cached_rules(void)
- cgroup_change_cgroup_event(struct proc_event*, int, FILE*)
V6:
- Wrote new parsing logic, which is cleaner and simpler.
- Added cgred script to enable using the daemon as a service.
- Wrote caching logic to cache rules table.
- Added the ability to force a reload of the rules table with SIGUSR2 signal.
- Added two structures to libcgroup: cgre_rule and cgre_rules_list
- New API function: cgroup_reload_cached_rules, which reloads the rules table.
- Added logging capabilities (default log is /root/cgrulesengd.conf)
TODO
====
- Find a way to replace Netlink, or at least clean up that code.
- Find a solution to the namespace problem.
|