summaryrefslogtreecommitdiffstats
path: root/security
Commit message (Collapse)AuthorAgeFilesLines
* kobject: convert kernel_kset to be a kobjectGreg Kroah-Hartman2008-01-241-1/+1
| | | | | | | | | | | | kernel_kset does not need to be a kset, but a much simpler kobject now that we have kobj_attributes. We also rename kernel_kset to kernel_kobj to catch all users of this symbol with a build error instead of an easy-to-ignore build warning. Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* kset: convert kernel_subsys to use kset_createGreg Kroah-Hartman2008-01-241-1/+1
| | | | | | | | | | Dynamically create the kset instead of declaring it statically. We also rename kernel_subsys to kernel_kset to catch all users of this symbol with a build error instead of an easy-to-ignore build warning. Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* kobject: convert securityfs to use kobject_createGreg Kroah-Hartman2008-01-241-6/+5
| | | | | | | | | | We don't need a kset here, a simple kobject will do just fine, so dynamically create the kobject and use it. Cc: Kay Sievers <kay.sievers@vrfy.org> Acked-by: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* kobject: remove struct kobj_type from struct ksetGreg Kroah-Hartman2008-01-241-2/+2
| | | | | | | | | | | | | | | | | We don't need a "default" ktype for a kset. We should set this explicitly every time for each kset. This change is needed so that we can make ksets dynamic, and cleans up one of the odd, undocumented assumption that the kset/kobject/ktype model has. This patch is based on a lot of help from Kay Sievers. Nasty bug in the block code was found by Dave Young <hidave.darkstar@gmail.com> Cc: Kay Sievers <kay.sievers@vrfy.org> Cc: Dave Young <hidave.darkstar@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* Merge branch 'for-linus' of ↵Linus Torvalds2008-01-212-4/+5
|\ | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/selinux-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/selinux-2.6: selinux: fix memory leak in netlabel code
| * selinux: fix memory leak in netlabel codePaul Moore2008-01-222-4/+5
| | | | | | | | | | | | | | | | Fix a memory leak in security_netlbl_sid_to_secattr() as reported here: * https://bugzilla.redhat.com/show_bug.cgi?id=352281 Signed-off-by: Paul Moore <paul.moore@hp.com> Signed-off-by: James Morris <jmorris@namei.org>
* | Fix filesystem capability supportAndrew G. Morgan2008-01-211-3/+10
|/ | | | | | | | | | | | | | | | | | | | In linux-2.6.24-rc1, security/commoncap.c:cap_inh_is_capped() was introduced. It has the exact reverse of its intended behavior. This led to an unintended privilege esculation involving a process' inheritable capability set. To be exposed to this bug, you need to have Filesystem Capabilities enabled and in use. That is: - CONFIG_SECURITY_FILE_CAPABILITIES must be defined for the buggy code to be compiled in. - You also need to have files on your system marked with fI bits raised. Signed-off-by: Andrew G. Morgan <morgan@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@akpm@linux-foundation.org>
* Security: allow capable check to permit mmap or low vm spaceEric Paris2007-12-061-1/+1
| | | | | | | | | | | | | | On a kernel with CONFIG_SECURITY but without an LSM which implements security_file_mmap it is impossible for an application to mmap addresses lower than mmap_min_addr. Based on a suggestion from a developer in the openwall community this patch adds a check for CAP_SYS_RAWIO. It is assumed that any process with this capability can harm the system a lot more easily than writing some stuff on the zero page and then trying to get the kernel to trip over itself. It also means that programs like X on i686 which use vm86 emulation can work even with mmap_min_addr set. Signed-off-by: Eric Paris <eparis@redhat.com> Signed-off-by: James Morris <jmorris@namei.org>
* SELinux: detect dead booleansStephen Smalley2007-12-061-13/+30
| | | | | | | | | | | | Instead of using f_op to detect dead booleans, check the inode index against the number of booleans and check the dentry name against the boolean name for that index on reads and writes. This prevents incorrect use of a boolean file opened prior to a policy reload while allowing valid use of it as long as it still corresponds to the same boolean in the policy. Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: James Morris <jmorris@namei.org>
* SELinux: do not clear f_op when removing entriesStephen Smalley2007-12-061-27/+1
| | | | | | | Do not clear f_op when removing entries since it isn't safe to do. Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: James Morris <jmorris@namei.org>
* file capabilities: don't prevent signaling setuid root programsSerge E. Hallyn2007-11-291-0/+9
| | | | | | | | | | | | | | | | | | | | | | | | | An unprivileged process must be able to kill a setuid root program started by the same user. This is legacy behavior needed for instance for xinit to kill X when the window manager exits. When an unprivileged user runs a setuid root program in !SECURE_NOROOT mode, fP, fI, and fE are set full on, so pP' and pE' are full on. Then cap_task_kill() prevents the user from signaling the setuid root task. This is a change in behavior compared to when !CONFIG_SECURITY_FILE_CAPABILITIES. This patch introduces a special check into cap_task_kill() just to check whether a non-root user is signaling a setuid root program started by the same user. If so, then signal is allowed. Signed-off-by: Serge E. Hallyn <serue@us.ibm.com> Cc: Andrew Morgan <morgan@kernel.org> Cc: Stephen Smalley <sds@epoch.ncsc.mil> Cc: Chris Wright <chrisw@sous-sol.org> Cc: James Morris <jmorris@namei.org> Cc: Casey Schaufler <casey@schaufler-ca.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* file capabilities: allow sigcont within sessionSerge E. Hallyn2007-11-141-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | Fix http://bugzilla.kernel.org/show_bug.cgi?id=9247 Allow sigcont to be sent to a process with greater capabilities if it is in the same session. Otherwise, a shell from which I've started a root shell and done 'suspend' can't be restarted by the parent shell. Also don't do file-capabilities signaling checks when uids for the processes don't match, since the standard check_kill_permission will have done those checks. [akpm@linux-foundation.org: coding-style cleanups] Signed-off-by: Serge E. Hallyn <serue@us.ibm.com> Acked-by: Andrew Morgan <morgan@kernel.org> Cc: Chris Wright <chrisw@sous-sol.org> Tested-by: "Theodore Ts'o" <tytso@mit.edu> Cc: Stephen Smalley <sds@epoch.ncsc.mil> Cc: "Rafael J. Wysocki" <rjw@sisk.pl> Cc: Chris Wright <chrisw@sous-sol.org> Cc: James Morris <jmorris@namei.org> Cc: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* SELinux: add more validity checks on policy loadStephen Smalley2007-11-087-38/+118
| | | | | | | | | | Add more validity checks at policy load time to reject malformed policies and prevent subsequent out-of-range indexing when in permissive mode. Resolves the NULL pointer dereference reported in https://bugzilla.redhat.com/show_bug.cgi?id=357541. Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: James Morris <jmorris@namei.org>
* SELinux: fix bug in new ebitmap code.KaiGai Kohei2007-11-081-1/+1
| | | | | | | | | The "e_iter = e_iter->next;" statement in the inner for loop is primally bug. It should be moved to outside of the for loop. Signed-off-by: KaiGai Kohei <kaigai@kaigai.gr.jp> Acked-by: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: James Morris <jmorris@namei.org>
* SELinux: suppress a warning for 64k pages.Stephen Rothwell2007-11-081-6/+7
| | | | | | | | | On PowerPC allmodconfig build we get this: security/selinux/xfrm.c:214: warning: comparison is always false due to limited range of data type Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: James Morris <jmorris@namei.org>
* Merge branch 'for-linus' of ↵Linus Torvalds2007-10-231-5/+1
|\ | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/selinux-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/selinux-2.6: SELinux: always check SIGCHLD in selinux_task_wait
| * SELinux: always check SIGCHLD in selinux_task_waitEric Paris2007-10-231-5/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When checking if we can wait on a child we were looking at p->exit_signal and trying to make the decision based on if the signal would eventually be allowed. One big flaw is that p->exit_signal is -1 for NPTL threads and so aignal_to_av was not actually checking SIGCHLD which is what would have been sent. Even is exit_signal was set to something strange it wouldn't change the fact that the child was there and needed to be waited on. This patch just assumes wait is based on SIGCHLD. Specific permission checks are made when the child actually attempts to send a signal. This resolves the problem of things like using GDB on confined domains such as in RH BZ 232371. The confined domain did not have permission to send a generic signal (exit_signal == -1) back to the unconfined GDB. With this patch the GDB wait works and since the actual signal sent is allowed everything functions as it should. Signed-off-by: Eric Paris <eparis@redhat.com> Signed-off-by: James Morris <jmorris@namei.org>
* | capabilities: clean up file capability readingSerge E. Hallyn2007-10-221-8/+15
|/ | | | | | | | | | | | | | | Simplify the vfs_cap_data structure. Also fix get_file_caps which was declaring __le32 v1caps[XATTR_CAPS_SZ] on the stack, but XATTR_CAPS_SZ is already * sizeof(__le32). [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Serge E. Hallyn <serue@us.ibm.com> Cc: Andrew Morgan <morgan@kernel.org> Cc: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* pid namespaces: define is_global_init() and is_container_init()Serge E. Hallyn2007-10-191-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | is_init() is an ambiguous name for the pid==1 check. Split it into is_global_init() and is_container_init(). A cgroup init has it's tsk->pid == 1. A global init also has it's tsk->pid == 1 and it's active pid namespace is the init_pid_ns. But rather than check the active pid namespace, compare the task structure with 'init_pid_ns.child_reaper', which is initialized during boot to the /sbin/init process and never changes. Changelog: 2.6.22-rc4-mm2-pidns1: - Use 'init_pid_ns.child_reaper' to determine if a given task is the global init (/sbin/init) process. This would improve performance and remove dependence on the task_pid(). 2.6.21-mm2-pidns2: - [Sukadev Bhattiprolu] Changed is_container_init() calls in {powerpc, ppc,avr32}/traps.c for the _exception() call to is_global_init(). This way, we kill only the cgroup if the cgroup's init has a bug rather than force a kernel panic. [akpm@linux-foundation.org: fix comment] [sukadev@us.ibm.com: Use is_global_init() in arch/m32r/mm/fault.c] [bunk@stusta.de: kernel/pid.c: remove unused exports] [sukadev@us.ibm.com: Fix capability.c to work with threaded init] Signed-off-by: Serge E. Hallyn <serue@us.ibm.com> Signed-off-by: Sukadev Bhattiprolu <sukadev@us.ibm.com> Acked-by: Pavel Emelianov <xemul@openvz.org> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Cedric Le Goater <clg@fr.ibm.com> Cc: Dave Hansen <haveblue@us.ibm.com> Cc: Herbert Poetzel <herbert@13thfloor.at> Cc: Kirill Korotaev <dev@sw.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* sparse pointer use of zero as nullStephen Hemminger2007-10-181-1/+1
| | | | | | | | | | | | | | | | | Get rid of sparse related warnings from places that use integer as NULL pointer. [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org> Cc: Andi Kleen <ak@suse.de> Cc: Jeff Garzik <jeff@garzik.org> Cc: Matt Mackall <mpm@selenic.com> Cc: Ian Kent <raven@themaw.net> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Davide Libenzi <davidel@xmailserver.org> Cc: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* V3 file capabilities: alter behavior of cap_setpcapAndrew Morgan2007-10-182-14/+61
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The non-filesystem capability meaning of CAP_SETPCAP is that a process, p1, can change the capabilities of another process, p2. This is not the meaning that was intended for this capability at all, and this implementation came about purely because, without filesystem capabilities, there was no way to use capabilities without one process bestowing them on another. Since we now have a filesystem support for capabilities we can fix the implementation of CAP_SETPCAP. The most significant thing about this change is that, with it in effect, no process can set the capabilities of another process. The capabilities of a program are set via the capability convolution rules: pI(post-exec) = pI(pre-exec) pP(post-exec) = (X(aka cap_bset) & fP) | (pI(post-exec) & fI) pE(post-exec) = fE ? pP(post-exec) : 0 at exec() time. As such, the only influence the pre-exec() program can have on the post-exec() program's capabilities are through the pI capability set. The correct implementation for CAP_SETPCAP (and that enabled by this patch) is that it can be used to add extra pI capabilities to the current process - to be picked up by subsequent exec()s when the above convolution rules are applied. Here is how it works: Let's say we have a process, p. It has capability sets, pE, pP and pI. Generally, p, can change the value of its own pI to pI' where (pI' & ~pI) & ~pP = 0. That is, the only new things in pI' that were not present in pI need to be present in pP. The role of CAP_SETPCAP is basically to permit changes to pI beyond the above: if (pE & CAP_SETPCAP) { pI' = anything; /* ie., even (pI' & ~pI) & ~pP != 0 */ } This capability is useful for things like login, which (say, via pam_cap) might want to raise certain inheritable capabilities for use by the children of the logged-in user's shell, but those capabilities are not useful to or needed by the login program itself. One such use might be to limit who can run ping. You set the capabilities of the 'ping' program to be "= cap_net_raw+i", and then only shells that have (pI & CAP_NET_RAW) will be able to run it. Without CAP_SETPCAP implemented as described above, login(pam_cap) would have to also have (pP & CAP_NET_RAW) in order to raise this capability and pass it on through the inheritable set. Signed-off-by: Andrew Morgan <morgan@kernel.org> Signed-off-by: Serge E. Hallyn <serue@us.ibm.com> Cc: Stephen Smalley <sds@tycho.nsa.gov> Cc: James Morris <jmorris@namei.org> Cc: Casey Schaufler <casey@schaufler-ca.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* security/ cleanupsAdrian Bunk2007-10-175-118/+1
| | | | | | | | | | | | | | | | This patch contains the following cleanups that are now possible: - remove the unused security_operations->inode_xattr_getsuffix - remove the no longer used security_operations->unregister_security - remove some no longer required exit code - remove a bunch of no longer used exports Signed-off-by: Adrian Bunk <bunk@kernel.org> Acked-by: James Morris <jmorris@namei.org> Cc: Chris Wright <chrisw@sous-sol.org> Cc: Stephen Smalley <sds@tycho.nsa.gov> Cc: Serge Hallyn <serue@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* Implement file posix capabilitiesSerge E. Hallyn2007-10-176-43/+313
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Implement file posix capabilities. This allows programs to be given a subset of root's powers regardless of who runs them, without having to use setuid and giving the binary all of root's powers. This version works with Kaigai Kohei's userspace tools, found at http://www.kaigai.gr.jp/index.php. For more information on how to use this patch, Chris Friedhoff has posted a nice page at http://www.friedhoff.org/fscaps.html. Changelog: Nov 27: Incorporate fixes from Andrew Morton (security-introduce-file-caps-tweaks and security-introduce-file-caps-warning-fix) Fix Kconfig dependency. Fix change signaling behavior when file caps are not compiled in. Nov 13: Integrate comments from Alexey: Remove CONFIG_ ifdef from capability.h, and use %zd for printing a size_t. Nov 13: Fix endianness warnings by sparse as suggested by Alexey Dobriyan. Nov 09: Address warnings of unused variables at cap_bprm_set_security when file capabilities are disabled, and simultaneously clean up the code a little, by pulling the new code into a helper function. Nov 08: For pointers to required userspace tools and how to use them, see http://www.friedhoff.org/fscaps.html. Nov 07: Fix the calculation of the highest bit checked in check_cap_sanity(). Nov 07: Allow file caps to be enabled without CONFIG_SECURITY, since capabilities are the default. Hook cap_task_setscheduler when !CONFIG_SECURITY. Move capable(TASK_KILL) to end of cap_task_kill to reduce audit messages. Nov 05: Add secondary calls in selinux/hooks.c to task_setioprio and task_setscheduler so that selinux and capabilities with file cap support can be stacked. Sep 05: As Seth Arnold points out, uid checks are out of place for capability code. Sep 01: Define task_setscheduler, task_setioprio, cap_task_kill, and task_setnice to make sure a user cannot affect a process in which they called a program with some fscaps. One remaining question is the note under task_setscheduler: are we ok with CAP_SYS_NICE being sufficient to confine a process to a cpuset? It is a semantic change, as without fsccaps, attach_task doesn't allow CAP_SYS_NICE to override the uid equivalence check. But since it uses security_task_setscheduler, which elsewhere is used where CAP_SYS_NICE can be used to override the uid equivalence check, fixing it might be tough. task_setscheduler note: this also controls cpuset:attach_task. Are we ok with CAP_SYS_NICE being used to confine to a cpuset? task_setioprio task_setnice sys_setpriority uses this (through set_one_prio) for another process. Need same checks as setrlimit Aug 21: Updated secureexec implementation to reflect the fact that euid and uid might be the same and nonzero, but the process might still have elevated caps. Aug 15: Handle endianness of xattrs. Enforce capability version match between kernel and disk. Enforce that no bits beyond the known max capability are set, else return -EPERM. With this extra processing, it may be worth reconsidering doing all the work at bprm_set_security rather than d_instantiate. Aug 10: Always call getxattr at bprm_set_security, rather than caching it at d_instantiate. [morgan@kernel.org: file-caps clean up for linux/capability.h] [bunk@kernel.org: unexport cap_inode_killpriv] Signed-off-by: Serge E. Hallyn <serue@us.ibm.com> Cc: Stephen Smalley <sds@tycho.nsa.gov> Cc: James Morris <jmorris@namei.org> Cc: Chris Wright <chrisw@sous-sol.org> Cc: Andrew Morgan <morgan@kernel.org> Signed-off-by: Andrew Morgan <morgan@kernel.org> Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* security: Convert LSM into a static interfaceJames Morris2007-10-178-71/+961
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Convert LSM into a static interface, as the ability to unload a security module is not required by in-tree users and potentially complicates the overall security architecture. Needlessly exported LSM symbols have been unexported, to help reduce API abuse. Parameters for the capability and root_plug modules are now specified at boot. The SECURITY_FRAMEWORK_VERSION macro has also been removed. In a nutshell, there is no safe way to unload an LSM. The modular interface is thus unecessary and broken infrastructure. It is used only by out-of-tree modules, which are often binary-only, illegal, abusive of the API and dangerous, e.g. silently re-vectoring SELinux. [akpm@linux-foundation.org: cleanups] [akpm@linux-foundation.org: USB Kconfig fix] [randy.dunlap@oracle.com: fix LSM kernel-doc] Signed-off-by: James Morris <jmorris@namei.org> Acked-by: Chris Wright <chrisw@sous-sol.org> Cc: Stephen Smalley <sds@tycho.nsa.gov> Cc: "Serge E. Hallyn" <serue@us.ibm.com> Acked-by: Arjan van de Ven <arjan@infradead.org> Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* KEYS: Make request_key() and co fundamentally asynchronousDavid Howells2007-10-175-317/+335
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Make request_key() and co fundamentally asynchronous to make it easier for NFS to make use of them. There are now accessor functions that do asynchronous constructions, a wait function to wait for construction to complete, and a completion function for the key type to indicate completion of construction. Note that the construction queue is now gone. Instead, keys under construction are linked in to the appropriate keyring in advance, and that anyone encountering one must wait for it to be complete before they can use it. This is done automatically for userspace. The following auxiliary changes are also made: (1) Key type implementation stuff is split from linux/key.h into linux/key-type.h. (2) AF_RXRPC provides a way to allocate null rxrpc-type keys so that AFS does not need to call key_instantiate_and_link() directly. (3) Adjust the debugging macros so that they're -Wformat checked even if they are disabled, and make it so they can be enabled simply by defining __KDEBUG to be consistent with other code of mine. (3) Documentation. [alan@lxorguk.ukuu.org.uk: keys: missing word in documentation] Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* SELinux: kills warnings in Improve SELinux performance when AVC missesKaiGai Kohei2007-10-172-6/+7
| | | | | | | | This patch kills ugly warnings when the "Improve SELinux performance when ACV misses" patch. Signed-off-by: KaiGai Kohei <kaigai@ak.jp.nec.com> Signed-off-by: James Morris <jmorris@namei.org>
* SELinux: improve performance when AVC misses.KaiGai Kohei2007-10-174-237/+303
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * We add ebitmap_for_each_positive_bit() which enables to walk on any positive bit on the given ebitmap, to improve its performance using common bit-operations defined in linux/bitops.h. In the previous version, this logic was implemented using a combination of ebitmap_for_each_bit() and ebitmap_node_get_bit(), but is was worse in performance aspect. This logic is most frequestly used to compute a new AVC entry, so this patch can improve SELinux performance when AVC misses are happen. * struct ebitmap_node is redefined as an array of "unsigned long", to get suitable for using find_next_bit() which is fasted than iteration of shift and logical operation, and to maximize memory usage allocated from general purpose slab. * Any ebitmap_for_each_bit() are repleced by the new implementation in ss/service.c and ss/mls.c. Some of related implementation are changed, however, there is no incompatibility with the previous version. * The width of any new line are less or equal than 80-chars. The following benchmark shows the effect of this patch, when we access many files which have different security context one after another. The number is more than /selinux/avc/cache_threshold, so any access always causes AVC misses. selinux-2.6 selinux-2.6-ebitmap AVG: 22.763 [s] 8.750 [s] STD: 0.265 0.019 ------------------------------------------ 1st: 22.558 [s] 8.786 [s] 2nd: 22.458 [s] 8.750 [s] 3rd: 22.478 [s] 8.754 [s] 4th: 22.724 [s] 8.745 [s] 5th: 22.918 [s] 8.748 [s] 6th: 22.905 [s] 8.764 [s] 7th: 23.238 [s] 8.726 [s] 8th: 22.822 [s] 8.729 [s] Signed-off-by: KaiGai Kohei <kaigai@ak.jp.nec.com> Acked-by: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: James Morris <jmorris@namei.org>
* SELinux: policy selectable handling of unknown classes and permsEric Paris2007-10-175-9/+106
| | | | | | | | | | | | | | | | | | Allow policy to select, in much the same way as it selects MLS support, how the kernel should handle access decisions which contain either unknown classes or unknown permissions in known classes. The three choices for the policy flags are 0 - Deny unknown security access. (default) 2 - reject loading policy if it does not contain all definitions 4 - allow unknown security access The policy's choice is exported through 2 booleans in selinuxfs. /selinux/deny_unknown and /selinux/reject_unknown. Signed-off-by: Eric Paris <eparis@redhat.com> Acked-by: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: James Morris <jmorris@namei.org>
* SELinux: Improve read/write performanceYuichi Nakamura2007-10-175-1/+67
| | | | | | | | | | | | | | It reduces the selinux overhead on read/write by only revalidating permissions in selinux_file_permission if the task or inode labels have changed or the policy has changed since the open-time check. A new LSM hook, security_dentry_open, is added to capture the necessary state at open time to allow this optimization. (see http://marc.info/?l=selinux&m=118972995207740&w=2) Signed-off-by: Yuichi Nakamura<ynakam@hitachisoft.jp> Acked-by: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: James Morris <jmorris@namei.org>
* SELinux: tune avtab to reduce memory usageYuichi Nakamura2007-10-174-36/+82
| | | | | | | | | | | | This patch reduces memory usage of SELinux by tuning avtab. Number of hash slots in avtab was 32768. Unused slots used memory when number of rules is fewer. This patch decides number of hash slots dynamically based on number of rules. (chain length)^2 is also printed out in avtab_hash_eval to see standard deviation of avtab hash table. Signed-off-by: Yuichi Nakamura<ynakam@hitachisoft.jp> Acked-by: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: James Morris <jmorris@namei.org>
* [SELINUX]: Update for netfilter ->hook() arg changes.David S. Miller2007-10-151-6/+5
| | | | | | They take a "struct sk_buff *" instead of a "struct sk_buff **" now. Signed-off-by: David S. Miller <davem@davemloft.net>
* [INET]: local port range robustnessStephen Hemminger2007-10-101-17/+22
| | | | | | | | | | | | | | | Expansion of original idea from Denis V. Lunev <den@openvz.org> Add robustness and locking to the local_port_range sysctl. 1. Enforce that low < high when setting. 2. Use seqlock to ensure atomic update. The locking might seem like overkill, but there are cases where sysadmin might want to change value in the middle of a DoS attack. Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* [NET]: Support multiple network namespaces with netlinkEric W. Biederman2007-10-101-2/+3
| | | | | | | | | | | | | | | | | | | | | Each netlink socket will live in exactly one network namespace, this includes the controlling kernel sockets. This patch updates all of the existing netlink protocols to only support the initial network namespace. Request by clients in other namespaces will get -ECONREFUSED. As they would if the kernel did not have the support for that netlink protocol compiled in. As each netlink protocol is updated to be multiple network namespace safe it can register multiple kernel sockets to acquire a presence in the rest of the network namespaces. The implementation in af_netlink is a simple filter implementation at hash table insertion and hash table look up time. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* [NET]: Make device event notification network namespace safeEric W. Biederman2007-10-101-0/+4
| | | | | | | | | | | | | | | | | | Every user of the network device notifiers is either a protocol stack or a pseudo device. If a protocol stack that does not have support for multiple network namespaces receives an event for a device that is not in the initial network namespace it quite possibly can get confused and do the wrong thing. To avoid problems until all of the protocol stacks are converted this patch modifies all netdev event handlers to ignore events on devices that are not in the initial network namespace. As the rest of the code is made network namespace aware these checks can be removed. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* SELinux: fix array out of bounds when mounting with selinux optionsEric Paris2007-09-201-0/+2
| | | | | | | | | | | | Given an illegal selinux option it was possible for match_token to work in random memory at the end of the match_table_t array. Note that privilege is required to perform a context mount, so this issue is effectively limited to root only. Signed-off-by: Eric Paris <eparis@redhat.com> Acked-by: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: James Morris <jmorris@namei.org>
* SELinux: clear parent death signal on SID transitionsStephen Smalley2007-08-301-0/+3
| | | | | | | | | Clear parent death signal on SID transitions to prevent unauthorized signaling between SIDs. Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov> Acked-by: Eric Paris <eparis@parisplace.org> Signed-off-by: James Morris <jmorris@localhost.localdomain>
* fix NULL pointer dereference in __vm_enough_memory()Alan Cox2007-08-223-6/+6
| | | | | | | | | | | | | | | | | | | | The new exec code inserts an accounted vma into an mm struct which is not current->mm. The existing memory check code has a hard coded assumption that this does not happen as does the security code. As the correct mm is known we pass the mm to the security method and the helper function. A new security test is added for the case where we need to pass the mm and the existing one is modified to pass current->mm to avoid the need to change large amounts of code. (Thanks to Tobias for fixing rejects and testing) Signed-off-by: Alan Cox <alan@redhat.com> Cc: WU Fengguang <wfg@mail.ustc.edu.cn> Cc: James Morris <jmorris@redhat.com> Cc: Tobias Diedrich <ranma+kernel@tdiedrich.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* SELinux: correct error code in selinux_audit_rule_initSteve G2007-08-161-1/+1
| | | | | | | Corrects an error code so that it is valid to pass to userspace. Signed-off-by: Steve Grubb <linux_4ever@yahoo.com> Signed-off-by: James Morris <jmorris@halo.namei>
* SELinux: remove redundant pointer checks before calling kfree()Paul Moore2007-08-021-2/+1
| | | | | | | | We don't need to check for NULL pointers before calling kfree(). Signed-off-by: Paul Moore <paul.moore@hp.com> Acked-by: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: James Morris <jmorris@namei.org>
* SELinux: restore proper NetLabel caching behaviorPaul Moore2007-08-021-4/+12
| | | | | | | | | | A small fix to the SELinux/NetLabel glue code to ensure that the NetLabel cache is utilized when possible. This was broken when the SELinux/NetLabel glue code was reorganized in the last kernel release. Signed-off-by: Paul Moore <paul.moore@hp.com> Acked-by: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: James Morris <jmorris@namei.org>
* Typo fixes errror -> errorGabriel Craciunescu2007-07-311-1/+1
| | | | | | | | | | | Typo fixes errror -> error Signed-off-by: Gabriel Craciunescu <nix.or.die@googlemail.com> Cc: Jeff Garzik <jeff@garzik.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* SELinux: null-terminate context string in selinux_xfrm_sec_ctx_allocVenkat Yekkirala2007-07-251-1/+2
| | | | | | | | | xfrm_audit_log() expects the context string to be null-terminated which currently doesn't happen with user-supplied contexts. Signed-off-by: Venkat Yekkirala <vyekkirala@TrustedCS.com> Acked-by: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: James Morris <jmorris@namei.org>
* SELinux: fix memory leak in security_netlbl_cache_add()Jesper Juhl2007-07-231-1/+3
| | | | | | | | | Fix memory leak in security_netlbl_cache_add() Note: The Coverity checker gets credit for spotting this one. Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com> Signed-off-by: James Morris <jmorris@namei.org> Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
* [PATCH] get rid of AVC_PATH postponed treatmentAl Viro2007-07-221-7/+8
| | | | | | | | | | | | | | | | | Selinux folks had been complaining about the lack of AVC_PATH records when audit is disabled. I must admit my stupidity - I assumed that avc_audit() really couldn't use audit_log_d_path() because of deadlocks (== could be called with dcache_lock or vfsmount_lock held). Shouldn't have made that assumption - it never gets called that way. It _is_ called under spinlocks, but not those. Since audit_log_d_path() uses ab->gfp_mask for allocations, kmalloc() in there is not a problem. IOW, the simple fix is sufficient: let's rip AUDIT_AVC_PATH out and simply generate pathname as part of main record. It's trivial to do. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Acked-by: James Morris <jmorris@namei.org>
* mm: Remove slab destructors from kmem_cache_create().Paul Mundt2007-07-204-4/+4
| | | | | | | | | | | | | | Slab destructors were no longer supported after Christoph's c59def9f222d44bb7e2f0a559f2906191a0862d7 change. They've been BUGs for both slab and slub, and slob never supported them either. This rips out support for the dtor pointer from kmem_cache_create() completely and fixes up every single callsite in the kernel (there were about 224, not including the slab allocator definitions themselves, or the documentation references). Signed-off-by: Paul Mundt <lethal@linux-sh.org>
* Merge branch 'for-linus' of ↵Linus Torvalds2007-07-192-31/+39
|\ | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/selinux-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/selinux-2.6: SELinux: use SECINITSID_NETMSG instead of SECINITSID_UNLABELED for NetLabel SELinux: enable dynamic activation/deactivation of NetLabel/SELinux enforcement
| * SELinux: use SECINITSID_NETMSG instead of SECINITSID_UNLABELED for NetLabelPaul Moore2007-07-192-31/+31
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | These changes will make NetLabel behave like labeled IPsec where there is an access check for both labeled and unlabeled packets as well as providing the ability to restrict domains to receiving only labeled packets when NetLabel is in use. The changes to the policy are straight forward with the following necessary to receive labeled traffic (with SECINITSID_NETMSG defined as "netlabel_peer_t"): allow mydom_t netlabel_peer_t:{ tcp_socket udp_socket rawip_socket } recvfrom; The policy for unlabeled traffic would be: allow mydom_t unlabeled_t:{ tcp_socket udp_socket rawip_socket } recvfrom; These policy changes, as well as more general NetLabel support, are included in the latest SELinux Reference Policy release 20070629 or later. Users who make use of NetLabel are strongly encouraged to upgrade their policy to avoid network problems. Users who do not make use of NetLabel will not notice any difference. Signed-off-by: Paul Moore <paul.moore@hp.com> Signed-off-by: James Morris <jmorris@namei.org>
| * SELinux: enable dynamic activation/deactivation of NetLabel/SELinux enforcementPaul Moore2007-07-191-0/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Create a new NetLabel KAPI interface, netlbl_enabled(), which reports on the current runtime status of NetLabel based on the existing configuration. LSMs that make use of NetLabel, i.e. SELinux, can use this new function to determine if they should perform NetLabel access checks. This patch changes the NetLabel/SELinux glue code such that SELinux only enforces NetLabel related access checks when netlbl_enabled() returns true. At present NetLabel is considered to be enabled when there is at least one labeled protocol configuration present. The result is that by default NetLabel is considered to be disabled, however, as soon as an administrator configured a CIPSO DOI definition NetLabel is enabled and SELinux starts enforcing NetLabel related access controls - including unlabeled packet controls. This patch also tries to consolidate the multiple "#ifdef CONFIG_NETLABEL" blocks into a single block to ease future review as recommended by Linus. Signed-off-by: Paul Moore <paul.moore@hp.com> Signed-off-by: James Morris <jmorris@namei.org>
* | coredump masking: reimplementation of dumpable using two flagsKawai, Hidehiro2007-07-192-2/+2
|/ | | | | | | | | | | | | | | | | This patch changes mm_struct.dumpable to a pair of bit flags. set_dumpable() converts three-value dumpable to two flags and stores it into lower two bits of mm_struct.flags instead of mm_struct.dumpable. get_dumpable() behaves in the opposite way. [akpm@linux-foundation.org: export set_dumpable] Signed-off-by: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Cc: David Howells <dhowells@redhat.com> Cc: Hugh Dickins <hugh@veritas.com> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* usermodehelper: Tidy up waitingJeremy Fitzhardinge2007-07-181-1/+2
| | | | | | | | | | | | | | | | | | | | | | | Rather than using a tri-state integer for the wait flag in call_usermodehelper_exec, define a proper enum, and use that. I've preserved the integer values so that any callers I've missed should still work OK. Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Cc: Randy Dunlap <randy.dunlap@oracle.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: Andi Kleen <ak@suse.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Johannes Berg <johannes@sipsolutions.net> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Bjorn Helgaas <bjorn.helgaas@hp.com> Cc: Joel Becker <joel.becker@oracle.com> Cc: Tony Luck <tony.luck@intel.com> Cc: Kay Sievers <kay.sievers@vrfy.org> Cc: Srivatsa Vaddagiri <vatsa@in.ibm.com> Cc: Oleg Nesterov <oleg@tv-sign.ru> Cc: David Howells <dhowells@redhat.com>