Diffstat (limited to 'README_daemon')
-rw-r--r-- | README_daemon | 178 |
1 files changed, 178 insertions, 0 deletions
diff --git a/README_daemon b/README_daemon
new file mode 100644
index 0000000..1ff7a33
--- /dev/null
+++ b/README_daemon
@@ -0,0 +1,178 @@
+DESCRIPTION
+===========
+The CGroup Rules Engine Daemon is a tool that will automatically place tasks
+into the correct cgroup based on UID/GID events from the kernel. It will not
+automatically classify tasks that are already running, but it will classify
+any new tasks, and any tasks which change their UID/GID. Note that we use
+the euid and egid, not the ruid and rgid.
+
+Unlike other tools, cgrulesengd caches the rules configuration in a data
+structure (it's actually just a FIFO linked list) so that it doesn't need to
+parse the configuration file more than once. This should be much faster than
+parsing the rules for each UID/GID event. Eventually, this caching logic
+should be part of libcgroup, so that any other program can take advantage of it
+(and so that all programs are using the same table). The configuration can
+be reloaded without stopping the daemon (more information below).
+
+WHY A DAEMON?
+=============
+A daemon is easy to use, and allows an administrator to ensure that all tasks
+are classified into the correct cgroups without constantly monitoring the
+system. The daemon is transparent to the users, and does not require any
+modifications to existing userspace programs. Finally, the daemon can be
+started and stopped at any time, including at boot time with other services.
+Thus, system administrators can decide not to use the daemon if they choose.
+
+Most importantly, some programs create new users and/or run scripts,
+threads, etc. as those users using suexec(). This call does not go through
+PAM, so these scripts would continue running in the same cgroup as the parent
+program. This behavior is likely not ideal, and the daemon would solve this
+problem.
+
+Apache does this. Apache creates a user called 'apache' and uses setuid() to
+launch tasks as that user. This does not go through PAM, so without a daemon,
+these tasks would continue to run in the 'root' cgroup rather than in the
+'apache' or 'webserver' cgroup. The daemon fixes this problem by catching the
+setuid() call and moving the tasks into the correct cgroup.
+
+We would ask Apache to modify their software to interface with libcgroup, but
+this solution is less than optimal because a lot of userspace software would
+have to be changed, and some authors might intentionally not interact with
+libcgroup, which could create an exploit. The daemon is a simple, transparent
+solution.
+
+USING THE DAEMON
+================
+The daemon can be used as a service with the cgred script, which is shipped
+as scripts/init.d/cgred. This script should be installed as /etc/init.d/cgred
+and used like any other service. To start the daemon,
+    /etc/init.d/cgred start
+To stop it,
+    /etc/init.d/cgred stop
+The restart (stop,start), condrestart (same as restart, but only if the daemon
+was already started), and status (print whether the daemon is started or
+stopped) commands are also supported. An additional command, "flash", allows
+you to reload the configuration file without stopping the daemon.
+    /etc/init.d/cgred flash
+The cgred script automatically loads configuration from /etc/cgred.d/cgred.conf,
+which is shipped as samples/cgred.conf. See that file for more information.
+
+If you choose not to run the daemon as a service, the following options are
+currently supported:
+    --nodaemon         Do not run as a daemon
+    --nolog            Write log output to stdout instead of a log file
+    --config [FILE]    Read rules configuration from FILE instead of
+                       /etc/cgrules.conf
+
+You can ask the daemon to reload the configuration by sending it SIGUSR2. The
+easiest way to do this is with the 'kill' command:
+    kill -s SIGUSR2 [PID]
+
+TESTING
+=======
+The program setuid (found in tests/setuid.c) can help you test the daemon. By
+default, this program attempts to change its UID to root and then idles until
+you kill it. You can change the default behavior to use a different UID, or
+you can uncomment the second block of code to instead attempt to change the
+GID.
+
+In order to make sure that everything works, I used the following rules:
+    sjo        cpu           default
+    cgtest     cpu           cgtest
+    %          memory        default
+    @cgroup    cpu,memory    cgtest
+    peter      cpu           test1
+    %          memory        test2
+    @root      *             default
+    *          *             default
+
+The users 'sjo' and 'cgtest' were normal users. 'peter' is not a user on the
+system. The group 'cgroup' is a group containing sjo,root,cgtest as members,
+and the group 'root' contains only root. The cgroups 'default' and 'cgtest'
+exist, while 'test1' and 'test2' do not. Currently, the daemon does not check
+for the existence of 'test1', though this would be easier to do once the
+parsing and caching logic is moved into libcgroup.
+
+I ran the following tests, all of which were successful:
+    - set UID to sjo (should move cpu controller into default)
+    - set UID to root (should move cpu,memory controllers into cgtest)
+    - set UID to cgtest (should move cpu controller into cgtest, memory
+      controller into default)
+    - set GID to root (should move all controllers into default)
+    - set GID to cgroup (should move cpu, memory into cgtest)
+    - set GID to users (should move all controllers into default)
+
+The parsing logic will skip the 'peter' rule as well as its multi-line
+components (in this case "% memory test2"), because the user does not exist.
+This should work for group rules, too. Attempting to setuid() or setgid() to a
+user/group that doesn't exist will just return an error and not generate a
+kernel event of the PROC_EVENT_UID or PROC_EVENT_GID type, so the daemon won't
+do anything for it.
+
+CONCERNS/ISSUES
+===============
+ - Netlink can be unreliable, and the daemon might miss an event if the buffer
+   is full. One possible solution is to have one or two files that the kernel
+   can queue UID/GID changes in, and have the daemon read those files whenever
+   they are updated. From testing, this does not actually appear to be a real
+   problem, but it could become one with faster machines.
+ - The daemon does not care for namespaces at all, which can cause conflicts
+   with containers. If a user places his tasks into exec-based cgroups (such
+   as 'network' and 'development'), the daemon will not realize this and will
+   simply place them into the user's cgroup (so, sjo/ instead of sjo/network/).
+
+CHANGELOG
+=========
+V9:
+ - Updated documentation, because it was very old and incorrect.
+ - Reverted the changes to cgexec and cgclassify.
+ - New API function: cgroup_change_cgroup_uid_gid_flags().
+ - Deprecated cgroup_change_cgroup_uid_gid().
+ - Rewrote some of the rule matching and execution logic in api.c to be
+   faster, and easier to read.
+ - Changes all negative return values to positive values. As a side effect,
+   cgroup_parse_rules() now returns -1 when we get a match and we are using
+   non-cached rules.
+ - Changes CGROUP_FUSECACHE to CGFLAG_USECACHE.
+ - Flags are now enumerated (cgflags), instead of #defines.
+
+V8:
+ - Moved the event-handling logic back into the daemon, where it should be.
+ - Changed cgroup_parse_rules() to work with cached rules or non-cached rules.
+   The other parsing function is no longer needed, and should be deprecated.
+ - Non-cached rules now work with the same structs as cached rules.
+ - Modified cgroup_change_cgroup_uid_gid() with a new 'flags' parameter.
+   Currently, the only flag is "CGROUP_FUSECACHE" to use the cached rules logic
+   (or not).
+ - Added cgroup_rules_loaded() boolean, to check whether the cached rules have
+   been loaded yet, and cgroup_init_rules_cache() to load them.
+ - Modified cgexec and cgclassify to work with the new
+   cgroup_change_cgroup_uid_gid().
+
+V7:
+ - Moved parsing and caching logic into libcgroup.
+ - Added locking mechanisms around the list of rules.
+ - Cleaned up #includes in cgrulesengd.[h,c].
+ - Added notification if netlink receive queue overflows.
+ - Added logic to catch SIGINT in addition to SIGTERM.
+ - New API functions:
+    - cgroup_free_rule(struct cgroup_rule*)
+    - cgroup_free_rule_list(struct cgroup_rule_list*)
+    - cgroup_parse_rules(void)
+    - cgroup_print_rules_config(FILE*)
+    - cgroup_reload_cached_rules(void)
+    - cgroup_change_cgroup_event(struct proc_event*, int, FILE*)
+
+V6:
+ - Wrote new parsing logic, which is cleaner and simpler.
+ - Added cgred script to enable using the daemon as a service.
+ - Wrote caching logic to cache rules table.
+ - Added the ability to force a reload of the rules table with SIGUSR2 signal.
+ - Added two structures to libcgroup: cgre_rule and cgre_rules_list
+ - New API function: cgroup_reload_cached_rules, which reloads the rules table.
+ - Added logging capabilities (default log is /root/cgrulesengd.conf)
+
+TODO
+====
+ - Find a way to replace Netlink, or at least clean up that code.
+ - Find a solution to the namespace problem.