DESCRIPTION
===========

The CGroup Rules Engine Daemon is a tool that will automatically place tasks
into the correct cgroup based on UID/GID events from the kernel. It will not
automatically classify tasks that are already running, but it will classify
any new tasks, and any tasks which change their UID/GID. Note that we use the
euid and egid, not the ruid and rgid.

Unlike other tools, cgrulesengd caches the rules configuration in a data
structure (it's actually just a FIFO linked list) so that it doesn't need to
parse the configuration file more than once. This should be much faster than
parsing the rules for each UID/GID event. Eventually, this caching logic
should be part of libcgroup, so that any other program can take advantage of
it (and so that all programs are using the same table).

The configuration can be reloaded without stopping the daemon (more
information below).

WHY A DAEMON?
=============

A daemon is easy to use, and allows an administrator to ensure that all tasks
are classified into the correct cgroups without constantly monitoring the
system. The daemon is transparent to the users, and does not require any
modifications to existing userspace programs. Finally, the daemon can be
started and stopped at any time, including at boot time with other services.
Thus, system administrators can decide not to use the daemon if they choose.

Most importantly, some programs create new users and/or run scripts, threads,
etc. as those users using suexec(). This call does not go through PAM, so
these scripts would continue running in the same cgroup as the parent
program. This behavior is likely not ideal, and the daemon would solve this
problem.

Apache does this. Apache creates a user called 'apache' and uses setuid() to
launch tasks as that user. This does not go through PAM, so without a daemon,
these tasks would continue to run in the 'root' cgroup rather than in the
'apache' or 'webserver' cgroup.
The daemon fixes this problem by catching the setuid() call and moving the
tasks into the correct cgroup.

We would ask Apache to modify their software to interface with libcgroup, but
this solution is less than optimal because a lot of userspace software would
have to be changed, and some authors might intentionally not interact with
libcgroup, which could create an exploit. The daemon is a simple, transparent
solution.

USING THE DAEMON
================

The daemon can be used as a service with the cgred script, which is shipped as
scripts/init.d/cgred. This script should be installed as /etc/init.d/cgred and
used like any other service. To start the daemon,

  /etc/init.d/cgred start

To stop it,

  /etc/init.d/cgred stop

The restart (stop,start), condrestart (same as restart, but only if the daemon
was already started), and status (print whether the daemon is started or
stopped) commands are also supported. An additional command, "flash", allows
you to reload the configuration file without stopping the daemon.

  /etc/init.d/cgred flash

The cgred script automatically loads configuration from
/etc/sysconfig/cgred.conf, which is shipped as samples/cgred.conf. See that
file for more information.

If you choose not to run the daemon as a service, the following options are
currently supported:

  --nodaemon        Do not run as a daemon
  --nolog           Write log output to stdout instead of a log file
  --config [FILE]   Read rules configuration from FILE instead of
                    /etc/cgrules.conf

You can ask the daemon to reload the configuration by sending it SIGUSR2. The
easiest way to do this is with the 'kill' command:

  kill -s SIGUSR2 [PID]

TESTING
=======

The program setuid (found in tests/setuid.c) can help you test the daemon. By
default, this program attempts to change its UID to root and then idles until
you kill it. You can change the default behavior to use a different UID, or
you can uncomment the second block of code to instead attempt to change the
GID.
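
If tests/setuid.c is not at hand, the same kind of check can be approximated
with a few lines of Python. This is only a rough sketch of the idea described
above (change the UID, then idle so the task can be inspected), not the shipped
test program. The function names change_uid and run are made up for this
illustration.

```python
import os
import signal
import sys


def change_uid(uid: int) -> bool:
    """Try to switch this process to `uid`; report failure instead of raising."""
    try:
        os.setuid(uid)
    except OSError as err:
        print(f"setuid({uid}) failed: {err}", file=sys.stderr)
        return False
    return True


def run(uid: int, idle: bool = True) -> bool:
    """Change UID, print the PID for inspection, then optionally idle."""
    ok = change_uid(uid)
    print(f"pid={os.getpid()} euid={os.geteuid()} changed={ok}")
    if ok and idle:
        # Sleep until a signal arrives, so the task stays around long
        # enough to look at which cgroup it ended up in.
        signal.pause()
    return ok
```

Call run() with the numeric UID of a test user, typically as root so that
setuid() is permitted. While the process idles, /proc/[PID]/cgroup shows which
cgroups the task currently belongs to.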

In order to make sure that everything works, I used the following rules:

  sjo          cpu            default
  cgtest       cpu            cgtest
  %            memory         default
  @cgroup      cpu,memory     cgtest
  peter        cpu            test1
  %            memory         test2
  @root        *              default
  *            *              default

The users 'sjo' and 'cgtest' were normal users. 'peter' is not a user on the
system. The group 'cgroup' is a group containing sjo,root,cgtest as members,
and the group 'root' contains only root. The cgroups 'default' and 'cgtest'
exist, while 'test1' and 'test2' do not. Currently, the daemon does not check
for the existence of 'test1', though this would be easier to do once the
parsing and caching logic is moved into libcgroup.

I ran the following tests, all of which were successful:

- set UID to sjo (should move cpu controller into default)
- set UID to root (should move cpu,memory controllers into cgtest)
- set UID to cgtest (should move cpu controller into cgtest, memory
  controller into default)
- set GID to root (should move all controllers into default)
- set GID to cgroup (should move cpu, memory into cgtest)
- set GID to users (should move all controllers into default)

The parsing logic will skip the 'peter' rule as well as its multi-line
components (in this case "% memory test2"), because the user does not exist.
This should work for group rules, too. Attempting to setuid() or setgid() to a
user/group that doesn't exist will just return an error and not generate a
kernel event of the PROC_EVENT_UID or PROC_EVENT_GID type, so the daemon won't
do anything for it.

CONCERNS/ISSUES
===============

- Netlink can be unreliable, and the daemon might miss an event if the buffer
  is full. One possible solution is to have one or two files that the kernel
  can queue UID/GID changes in, and have the daemon read those files whenever
  they are updated. From testing, this does not actually appear to be a real
  problem, but it could become one with faster machines.

- The daemon does not take namespaces into account at all, which can cause
  conflicts with containers.
  If a user places his tasks into exec-based cgroups (such as 'network' and
  'development'), the daemon will not realize this and will simply place them
  into the user's cgroup (so, sjo/ instead of sjo/network/).

CHANGELOG
=========

V9:
  - Updated documentation, because it was very old and incorrect.
  - Reverted the changes to cgexec and cgclassify.
  - New API function: cgroup_change_cgroup_uid_gid_flags().
  - Deprecated cgroup_change_cgroup_uid_gid().
  - Rewrote some of the rule matching and execution logic in api.c to be
    faster, and easier to read.
  - Changed all negative return values to positive values. As a side effect,
    cgroup_parse_rules() now returns -1 when we get a match and we are using
    non-cached rules.
  - Changed CGROUP_FUSECACHE to CGFLAG_USECACHE.
  - Flags are now enumerated (cgflags), instead of #defines.

V8:
  - Moved the event-handling logic back into the daemon, where it should be.
  - Changed cgroup_parse_rules() to work with cached rules or non-cached
    rules. The other parsing function is no longer needed, and should be
    deprecated.
  - Non-cached rules now work with the same structs as cached rules.
  - Modified cgroup_change_cgroup_uid_gid() with a new 'flags' parameter.
    Currently, the only flag is "CGROUP_FUSECACHE" to use the cached rules
    logic (or not).
  - Added cgroup_rules_loaded() boolean, to check whether the cached rules
    have been loaded yet, and cgroup_init_rules_cache() to load them.
  - Modified cgexec and cgclassify to work with the new
    cgroup_change_cgroup_uid_gid().

V7:
  - Moved parsing and caching logic into libcgroup.
  - Added locking mechanisms around the list of rules.
  - Cleaned up #includes in cgrulesengd.[h,c].
  - Added notification if netlink receive queue overflows.
  - Added logic to catch SIGINT in addition to SIGTERM.
  - New API functions:
      - cgroup_free_rule(struct cgroup_rule*)
      - cgroup_free_rule_list(struct cgroup_rule_list*)
      - cgroup_parse_rules(void)
      - cgroup_print_rules_config(FILE*)
      - cgroup_reload_cached_rules(void)
      - cgroup_change_cgroup_event(struct proc_event*, int, FILE*)

V6:
  - Wrote new parsing logic, which is cleaner and simpler.
  - Added cgred script to enable using the daemon as a service.
  - Wrote caching logic to cache rules table.
  - Added the ability to force a reload of the rules table with SIGUSR2
    signal.
  - Added two structures to libcgroup: cgre_rule and cgre_rules_list
  - New API function: cgroup_reload_cached_rules, which reloads the rules
    table.
  - Added logging capabilities (default log is /root/cgrulesengd.conf)

TODO
====

- Find a way to replace Netlink, or at least clean up that code.
- Find a solution to the namespace problem.