summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* Make sure sk_security is initialized on accept()ed socketsMiloslav Trmač2010-11-301-0/+1
|
* Add missing lockdep class namesMiloslav Trmač2010-11-301-3/+3
|
* crypto: algif_skcipher - Fixed overflow when sndbuf is page alignedherbertHerbert Xu2010-11-301-21/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Hi: Just noticed that algif_skcipher fails to apply the sk_sndbuf limit correctly unless it is a multiple of PAGE_SIZE. What happens is that the merge path will exceed sk_sndbuf causing the subsequent limit comparison to fail as it tries to do an unsigned comparison with a negative value. This patch fixes the problem. commit 0f6bb83cb12e4617e696ffa566f3fc6c092686e2 Author: Herbert Xu <herbert@gondor.apana.org.au> Date: Tue Nov 30 16:49:02 2010 +0800 crypto: algif_skcipher - Fixed overflow when sndbuf is page aligned When sk_sndbuf is not a multiple of PAGE_SIZE, the limit tests in sendmsg fail as the limit variable becomes negative and we're using an unsigned comparison. The same thing can happen if sk_sndbuf is lowered after a sendmsg call. This patch fixes this by always taking the signed maximum of limit and 0 before we perform the comparison. It also rounds the value of sk_sndbuf down to a multiple of PAGE_SIZE so that we don't end up allocating a page only to use a small number of bytes in it because we're bound by sk_sndbuf. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* crypto: algif_skcipher - Handle unaligned receive bufferHerbert Xu2010-11-301-6/+7
| | | | | | | | | | | | | | | | | | | | | | | Hi: This patch fixes unexpected EINVAL failures on recvmsg when encrypting/decrypting due to unaligned receive buffers. commit bc97e57eb21f8db55bf0e1f182d384e75b2e3c99 Author: Herbert Xu <herbert@gondor.apana.org.au> Date: Tue Nov 30 17:04:31 2010 +0800 crypto: algif_skcipher - Handle unaligned receive buffer As it is if user-space passes through a receive buffer that's not aligned to to the cipher block size, we'll end up encrypting or decrypting a partial block which causes a spurious EINVAL to be returned. This patch fixes this by moving the partial block test after the af_alg_make_sg call. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* crypto: af_alg - Add dependency on NETHerbert Xu2010-11-301-0/+2
| | | | | | | | | | | | Add missing dependency on NET since we require sockets for our interface. Should really be a select but kconfig doesn't like that: net/Kconfig:6:error: found recursive dependency: NET -> NETWORK_FILESYSTEMS -> AFS_FS -> AF_RXRPC -> CRYPTO -> CRYPTO_USER_API_HASH -> CRYPTO_USER_API -> NET Reported-by: Zimny Lech <napohybelskurwysynom2010@gmail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* crypto: algif_skcipher - Pass on error from af_alg_make_sgHerbert Xu2010-11-301-1/+2
| | | | | | | The error returned from af_alg_make_sg is currently lost and we always pass on -EINVAL. This patch pases on the underlying error. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* RFC: Crypto API User-interfaceHerbert Xu2010-11-281-1/+7
| | | | | | | | | | | | | | | | | | | | | | On Thu, Nov 04, 2010 at 01:43:16PM -0400, Miloslav Trmac wrote: > > shash_async_import() - it assumes that the struct shash_desc placed in ahash_request_ctx() of the struct ahash_request was initialized to point to the tfm, which is only done in shash_async_init(). Thanks for catching this. This patch should fix the problem. commit 8850e3641dcc7446628681bd7c5f771005e0b208 Author: Herbert Xu <herbert@gondor.apana.org.au> Date: Thu Nov 4 13:00:22 2010 -0500 crypto: hash - Fix async import on shash algorithm The function shash_async_import did not initialise the descriptor correctly prior to calling the underlying shash import function. This patch adds the required initialisation. Reported-by: Miloslav Trmac <mitr@redhat.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* crypto: algif_skcipher - User-space interface for skcipher operationsHerbert Xu2010-11-283-0/+649
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | crypto: algif_skcipher - User-space interface for skcipher operations This patch adds the af_alg plugin for symmetric key ciphers, corresponding to the ablkcipher kernel operation type. Keys can optionally be set through the setsockopt interface. Once a sendmsg call occurs without MSG_MORE no further writes may be made to the socket until all previous data has been read. IVs and and whether encryption/decryption is performed can be set through the setsockopt interface or as a control message to sendmsg. The interface is completely synchronous, all operations are carried out in recvmsg(2) and will complete prior to the system call returning. The splice(2) interface support reading the user-space data directly without copying (except that the Crypto API itself may copy the data if alignment is off). The recvmsg(2) interface supports directly writing to user-space without additional copying, i.e., the kernel crypto interface will receive the user-space address as its output SG list. Thakns to Miloslav Trmac for reviewing this and contributing fixes and improvements. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Acked-by: David S. Miller <davem@davemloft.net>
* crypto: algif_hash - User-space interface for hash operationsHerbert Xu2010-11-283-0/+328
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | crypto: algif_hash - User-space interface for hash operations This patch adds the af_alg plugin for hash, corresponding to the ahash kernel operation type. Keys can optionally be set through the setsockopt interface. Each sendmsg call will finalise the hash unless sent with a MSG_MORE flag. Partial hash states can be cloned using accept(2). The interface is completely synchronous, all operations will complete prior to the system call returning. Both sendmsg(2) and splice(2) support reading the user-space data directly without copying (except that the Crypto API itself may copy the data if alignment is off). For now only the splice(2) interface supports performing digest instead of init/update/final. In future the sendmsg(2) interface will also be modified to use digest/finup where possible so that hardware that cannot return a partial hash state can still benefit from this interface. Thakns to Miloslav Trmac for reviewing this and contributing fixes and improvements. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Acked-by: David S. Miller <davem@davemloft.net> Tested-by: Martin Willi <martin@strongswan.org>
* crypto: af_alg - User-space interface for Crypto APIHerbert Xu2010-11-285-0/+618
| | | | | | | | | | | | | | | | | | | | | | | | | | | crypto: af_alg - User-space interface for Crypto API This patch creates the backbone of the user-space interface for the Crypto API, through a new socket family AF_ALG. Each session corresponds to one or more connections obtained from that socket. The number depends on the number of inputs/outputs of that particular type of operation. For most types there will be a s ingle connection/file descriptor that is used for both input and output. AEAD is one of the few that require two inputs. Each algorithm type will provide its own implementation that plugs into af_alg. They're keyed using a string such as "skcipher" or "hash". IOW this patch only contains the boring bits that is required to hold everything together. Thakns to Miloslav Trmac for reviewing this and contributing fixes and improvements. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Acked-by: David S. Miller <davem@davemloft.net> Tested-by: Martin Willi <martin@strongswan.org>
* net - Add AF_ALG macrosHerbert Xu2010-11-281-1/+4
| | | | | | | | | | | net - Add AF_ALG macros This patch adds the socket family/level macros for the yet-to-be-born AF_ALG family. The AF_ALG family provides the user-space interface for the kernel crypto API. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Acked-by: David S. Miller <davem@davemloft.net>
* Linux 2.6.35.6masterGreg Kroah-Hartman2010-09-261-1/+1
|
* alpha: Fix printk format errorsMichael Cree2010-09-261-3/+3
| | | | | | | | | | | | | | | | commit 3e073367a57d41e506f20aebb98e308387ce3090 upstream. When compiling alpha generic build get errors such as: arch/alpha/kernel/err_marvel.c: In function ‘marvel_print_err_cyc’: arch/alpha/kernel/err_marvel.c:119: error: format ‘%ld’ expects type ‘long int’, but argument 6 has type ‘u64’ Replaced a number of %ld format specifiers with %lld since u64 is unsigned long long. Signed-off-by: Michael Cree <mcree@orcon.net.nz> Signed-off-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* drm/i915: Ensure that the crtcinfo is populated during mode_fixup()Chris Wilson2010-09-261-0/+8
| | | | | | | | | | | | | | | commit 897493504addc5609f04a2c4f73c37ab972c29b2 upstream. This should fix the mysterious mode setting failures reported during boot up and after resume, generally for i8xx class machines. Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=16478 Reported-and-tested-by: Xavier Chantry <chantry.xavier@gmail.com> Buzilla: https://bugs.freedesktop.org/show_bug.cgi?id=29413 Tested-by: Daniel Vetter <daniel@ffwll.ch> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* sctp: Do not reset the packet during sctp_packet_config().Vlad Yasevich2010-09-261-1/+0
| | | | | | | | | | | | | | | | commit 4bdab43323b459900578b200a4b8cf9713ac8fab upstream. sctp_packet_config() is called when getting the packet ready for appending of chunks. The function should not touch the current state, since it's possible to ping-pong between two transports when sending, and that can result packet corruption followed by skb overlfow crash. Reported-by: Thomas Dreibholz <dreibh@iem.uni-due.de> Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* Fix unprotected access to task credentials in waitid()Daniel J Blueman2010-09-261-3/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit f362b73244fb16ea4ae127ced1467dd8adaa7733 upstream. Using a program like the following: #include <stdlib.h> #include <unistd.h> #include <sys/types.h> #include <sys/wait.h> int main() { id_t id; siginfo_t infop; pid_t res; id = fork(); if (id == 0) { sleep(1); exit(0); } kill(id, SIGSTOP); alarm(1); waitid(P_PID, id, &infop, WCONTINUED); return 0; } to call waitid() on a stopped process results in access to the child task's credentials without the RCU read lock being held - which may be replaced in the meantime - eliciting the following warning: =================================================== [ INFO: suspicious rcu_dereference_check() usage. ] --------------------------------------------------- kernel/exit.c:1460 invoked rcu_dereference_check() without protection! other info that might help us debug this: rcu_scheduler_active = 1, debug_locks = 1 2 locks held by waitid02/22252: #0: (tasklist_lock){.?.?..}, at: [<ffffffff81061ce5>] do_wait+0xc5/0x310 #1: (&(&sighand->siglock)->rlock){-.-...}, at: [<ffffffff810611da>] wait_consider_task+0x19a/0xbe0 stack backtrace: Pid: 22252, comm: waitid02 Not tainted 2.6.35-323cd+ #3 Call Trace: [<ffffffff81095da4>] lockdep_rcu_dereference+0xa4/0xc0 [<ffffffff81061b31>] wait_consider_task+0xaf1/0xbe0 [<ffffffff81061d15>] do_wait+0xf5/0x310 [<ffffffff810620b6>] sys_waitid+0x86/0x1f0 [<ffffffff8105fce0>] ? child_wait_callback+0x0/0x70 [<ffffffff81003282>] system_call_fastpath+0x16/0x1b This is fixed by holding the RCU read lock in wait_task_continued() to ensure that the task's current credentials aren't destroyed between us reading the cred pointer and us reading the UID from those credentials. Furthermore, protect wait_task_stopped() in the same way. We don't need to keep holding the RCU read lock once we've read the UID from the credentials as holding the RCU read lock doesn't stop the target task from changing its creds under us - so the credentials may be outdated immediately after we've read the pointer, lock or no lock. Signed-off-by: Daniel J Blueman <daniel.blueman@gmail.com> Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Acked-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* guard page for stacks that grow upwardsLuck, Tony2010-09-263-8/+18
| | | | | | | | | | | | | | | commit 8ca3eb08097f6839b2206e2242db4179aee3cfb3 upstream. pa-risc and ia64 have stacks that grow upwards. Check that they do not run into other mappings. By making VM_GROWSUP 0x0 on architectures that do not ever use it, we can avoid some unpleasant #ifdefs in check_stack_guard_page(). Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: dann frazier <dannf@debian.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* mm: page allocator: update free page counters after pages are placed on the ↵Mel Gorman2010-09-261-4/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | free list commit 72853e2991a2702ae93aaf889ac7db743a415dd3 upstream. When allocating a page, the system uses NR_FREE_PAGES counters to determine if watermarks would remain intact after the allocation was made. This check is made without interrupts disabled or the zone lock held and so is race-prone by nature. Unfortunately, when pages are being freed in batch, the counters are updated before the pages are added on the list. During this window, the counters are misleading as the pages do not exist yet. When under significant pressure on systems with large numbers of CPUs, it's possible for processes to make progress even though they should have been stalled. This is particularly problematic if a number of the processes are using GFP_ATOMIC as the min watermark can be accidentally breached and in extreme cases, the system can livelock. This patch updates the counters after the pages have been added to the list. This makes the allocator more cautious with respect to preserving the watermarks and mitigates livelock possibilities. [akpm@linux-foundation.org: avoid modifying incoming args] Signed-off-by: Mel Gorman <mel@csn.ul.ie> Reviewed-by: Rik van Riel <riel@redhat.com> Reviewed-by: Minchan Kim <minchan.kim@gmail.com> Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Reviewed-by: Christoph Lameter <cl@linux.com> Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* mm: page allocator: calculate a better estimate of NR_FREE_PAGES when memory ↵Christoph Lameter2010-09-265-3/+72
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | is low and kswapd is awake commit aa45484031ddee09b06350ab8528bfe5b2c76d1c upstream. Ordinarily watermark checks are based on the vmstat NR_FREE_PAGES as it is cheaper than scanning a number of lists. To avoid synchronization overhead, counter deltas are maintained on a per-cpu basis and drained both periodically and when the delta is above a threshold. On large CPU systems, the difference between the estimated and real value of NR_FREE_PAGES can be very high. If NR_FREE_PAGES is much higher than number of real free page in buddy, the VM can allocate pages below min watermark, at worst reducing the real number of pages to zero. Even if the OOM killer kills some victim for freeing memory, it may not free memory if the exit path requires a new page resulting in livelock. This patch introduces a zone_page_state_snapshot() function (courtesy of Christoph) that takes a slightly more accurate view of an arbitrary vmstat counter. It is used to read NR_FREE_PAGES while kswapd is awake to avoid the watermark being accidentally broken. The estimate is not perfect and may result in cache line bounces but is expected to be lighter than the IPI calls necessary to continually drain the per-cpu counters while kswapd is awake. Signed-off-by: Christoph Lameter <cl@linux.com> Signed-off-by: Mel Gorman <mel@csn.ul.ie> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* mm: page allocator: drain per-cpu lists after direct reclaim allocation failsMel Gorman2010-09-261-4/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 9ee493ce0a60bf42c0f8fd0b0fe91df5704a1cbf upstream. When under significant memory pressure, a process enters direct reclaim and immediately afterwards tries to allocate a page. If it fails and no further progress is made, it's possible the system will go OOM. However, on systems with large amounts of memory, it's possible that a significant number of pages are on per-cpu lists and inaccessible to the calling process. This leads to a process entering direct reclaim more often than it should increasing the pressure on the system and compounding the problem. This patch notes that if direct reclaim is making progress but allocations are still failing that the system is already under heavy pressure. In this case, it drains the per-cpu lists and tries the allocation a second time before continuing. Signed-off-by: Mel Gorman <mel@csn.ul.ie> Reviewed-by: Minchan Kim <minchan.kim@gmail.com> Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Reviewed-by: Christoph Lameter <cl@linux.com> Cc: Dave Chinner <david@fromorbit.com> Cc: Wu Fengguang <fengguang.wu@intel.com> Cc: David Rientjes <rientjes@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* dell-wmi: Add support for eject key on Dell Studio 1555Islam Amer2010-09-261-1/+1
| | | | | | | | | | | | | | commit d5164dbf1f651d1e955b158fb70a9c844cc91cd1 upstream. Fixes pressing the eject key on Dell Studio 1555 does not work and produces message : dell-wmi: Unknown key 0 pressed Signed-off-by: Islam Amer <pharon@gmail.com> Cc: Kyle McMartin <kyle@mcmartin.ca> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* Fix call to replaced SuperIO functionsMorten H. Larsen2010-09-263-8/+25
| | | | | | | | | | | | | | commit 59b25ed91400ace98d6cf0d59b1cb6928ad5cd37 upstream. This patch fixes the failure to compile Alpha Generic because of previously overlooked calls to ns87312_enable_ide(). The function has been replaced by newer SuperIO code. Tested-by: Michael Cree <mcree@orcon.net.nz> Signed-off-by: Morten H. Larsen <m-larsen@post6.tele.dk> Signed-off-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* ALSA: hda - Fix beep frequency on IDT 92HD73xx and 92HD71Bxx codecsDaniel J Blueman2010-09-261-1/+11
| | | | | | | | | | | | | | | | | commit 1b0e372d7b52c9fc96348779015a6db7df7f286e upstream. Fix HDA beep frequency on IDT 92HD73xx and 92HD71Bxx codecs. These codecs use the standard beep frequency calculation although the datasheet says it's linear frequency. Other IDT/STAC codecs might have the same problem. They should be fixed individually later. Signed-off-by: Daniel J Blueman <daniel.blueman@gmail.com> Signed-off-by: Takashi Iwai <tiwai@suse.de> Cc: أحمد المحمودي <aelmahmoudy@sabily.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* x86, asm: Use a lower case name for the end macro in atomic64_386_32.SLuca Barbieri2010-09-261-18/+20
| | | | | | | | | | | | | commit 417484d47e115774745ef025bce712a102b6f86f upstream. Use a lowercase name for the end macro, which somehow fixes a binutils 2.16 problem. Signed-off-by: Luca Barbieri <luca@luca-barbieri.com> LKML-Reference: <tip-30246557a06bb20618bed906a06d1e1e0faa8bb4@git.kernel.org> Signed-off-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* PM / Hibernate: Avoid hitting OOM during preallocation of memoryRafael J. Wysocki2010-09-261-20/+65
| | | | | | | | | | | | | | | | | | | | | | | | | | commit 6715045ddc7472a22be5e49d4047d2d89b391f45 upstream. There is a problem in hibernate_preallocate_memory() that it calls preallocate_image_memory() with an argument that may be greater than the total number of available non-highmem memory pages. If that's the case, the OOM condition is guaranteed to trigger, which in turn can cause significant slowdown to occur during hibernation. To avoid that, make preallocate_image_memory() adjust its argument before calling preallocate_image_pages(), so that the total number of saveable non-highem pages left is not less than the minimum size of a hibernation image. Change hibernate_preallocate_memory() to try to allocate from highmem if the number of pages allocated by preallocate_image_memory() is too low. Modify free_unnecessary_pages() to take all possible memory allocation patterns into account. Reported-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Tested-by: M. Vefa Bicakci <bicave@superonline.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* PM: Prevent waiting forever on asynchronous resume after failing suspendColin Cross2010-09-261-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 152e1d592071c8b312bb898bc1118b64e4aea535 upstream. During suspend, the power.completion is expected to be set when a device has not yet started suspending. Set it on init to fix a corner case where a device is resumed when its parent has never suspended. Consider three drivers, A, B, and C. The parent of A is C, and C has async_suspend set. On boot, C->power.completion is initialized to 0. During the first suspend: suspend_devices_and_enter(...) dpm_resume(...) device_suspend(A) device_suspend(B) returns error, aborts suspend dpm_resume_end(...) dpm_resume(...) device_resume(A) dpm_wait(A->parent == C) wait_for_completion(C->power.completion) The wait_for_completion will never complete, because complete_all(C->power.completion) will only be called from device_suspend(C) or device_resume(C), neither of which is called if suspend is aborted before C. After a successful suspend->resume cycle, where B doesn't abort suspend, C->power.completion is left in the completed state by the call to device_resume(C), and the same call path will work if B aborts suspend. Signed-off-by: Colin Cross <ccross@android.com> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* AT91: change dma resource indexNicolas Ferre2010-09-261-1/+1
| | | | | | | | | commit 8d2602e0778299e2d6084f03086b716d6e7a1e1e upstream. Reported-by: Dan Liang <dan.liang@atmel.com> Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* drivers/video/via/ioctl.c: prevent reading uninitialized stack memoryDan Rosenberg2010-09-261-0/+2
| | | | | | | | | | | | | | | commit b4aaa78f4c2f9cde2f335b14f4ca30b01f9651ca upstream. The VIAFB_GET_INFO device ioctl allows unprivileged users to read 246 bytes of uninitialized stack memory, because the "reserved" member of the viafb_ioctl_info struct declared on the stack is not altered or zeroed before being copied back to the user. This patch takes care of it. Signed-off-by: Dan Rosenberg <dan.j.rosenberg@gmail.com> Signed-off-by: Florian Tobias Schandinat <FlorianSchandinat@gmx.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* xfs: prevent reading uninitialized stack memoryDan Rosenberg2010-09-261-0/+2
| | | | | | | | | | | | | | | | | commit a122eb2fdfd78b58c6dd992d6f4b1aaef667eef9 upstream. The XFS_IOC_FSGETXATTR ioctl allows unprivileged users to read 12 bytes of uninitialized stack memory, because the fsxattr struct declared on the stack in xfs_ioc_fsgetxattr() does not alter (or zero) the 12-byte fsx_pad member before copying it back to the user. This patch takes care of it. Signed-off-by: Dan Rosenberg <dan.j.rosenberg@gmail.com> Reviewed-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Alex Elder <aelder@sgi.com> Cc: dann frazier <dannf@debian.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* KEYS: Fix bug in keyctl_session_to_parent() if parent has no session keyringDavid Howells2010-09-261-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 3d96406c7da1ed5811ea52a3b0905f4f0e295376 upstream. Fix a bug in keyctl_session_to_parent() whereby it tries to check the ownership of the parent process's session keyring whether or not the parent has a session keyring [CVE-2010-2960]. This results in the following oops: BUG: unable to handle kernel NULL pointer dereference at 00000000000000a0 IP: [<ffffffff811ae4dd>] keyctl_session_to_parent+0x251/0x443 ... Call Trace: [<ffffffff811ae2f3>] ? keyctl_session_to_parent+0x67/0x443 [<ffffffff8109d286>] ? __do_fault+0x24b/0x3d0 [<ffffffff811af98c>] sys_keyctl+0xb4/0xb8 [<ffffffff81001eab>] system_call_fastpath+0x16/0x1b if the parent process has no session keyring. If the system is using pam_keyinit then it mostly protected against this as all processes derived from a login will have inherited the session keyring created by pam_keyinit during the log in procedure. To test this, pam_keyinit calls need to be commented out in /etc/pam.d/. Reported-by: Tavis Ormandy <taviso@cmpxchg8b.com> Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Tavis Ormandy <taviso@cmpxchg8b.com> Cc: dann frazier <dannf@debian.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* KEYS: Fix RCU no-lock warning in keyctl_session_to_parent()David Howells2010-09-261-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 9d1ac65a9698513d00e5608d93fca0c53f536c14 upstream. There's an protected access to the parent process's credentials in the middle of keyctl_session_to_parent(). This results in the following RCU warning: =================================================== [ INFO: suspicious rcu_dereference_check() usage. ] --------------------------------------------------- security/keys/keyctl.c:1291 invoked rcu_dereference_check() without protection! other info that might help us debug this: rcu_scheduler_active = 1, debug_locks = 0 1 lock held by keyctl-session-/2137: #0: (tasklist_lock){.+.+..}, at: [<ffffffff811ae2ec>] keyctl_session_to_parent+0x60/0x236 stack backtrace: Pid: 2137, comm: keyctl-session- Not tainted 2.6.36-rc2-cachefs+ #1 Call Trace: [<ffffffff8105606a>] lockdep_rcu_dereference+0xaa/0xb3 [<ffffffff811ae379>] keyctl_session_to_parent+0xed/0x236 [<ffffffff811af77e>] sys_keyctl+0xb4/0xb6 [<ffffffff81001eab>] system_call_fastpath+0x16/0x1b The code should take the RCU read lock to make sure the parents credentials don't go away, even though it's holding a spinlock and has IRQ disabled. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: dann frazier <dannf@debian.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* IA64: Optimize ticket spinlocks in fsys_rt_sigprocmaskPetr Tesarik2010-09-261-31/+11
| | | | | | | | | | | | | commit 2d2b6901649a62977452be85df53eda2412def24 upstream. Tony's fix (f574c843191728d9407b766a027f779dcd27b272) has a small bug, it incorrectly uses "r3" as a scratch register in the first of the two unlock paths ... it is also inefficient. Optimize the fast path again. Signed-off-by: Petr Tesarik <ptesarik@suse.cz> Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* IA64: fix siglockTony Luck2010-09-261-7/+39
| | | | | | | | | | | | | | | | | | | | | commit f574c843191728d9407b766a027f779dcd27b272 upstream. When ia64 converted to using ticket locks, an inline implementation of trylock/unlock in fsys.S was missed. This was not noticed because in most circumstances it simply resulted in using the slow path because the siglock was apparently not available (under old spinlock rules). Problems occur when the ticket spinlock has value 0x0 (when first initialised, or when it wraps around). At this point the fsys.S code acquires the lock (changing the 0x0 to 0x1. If another process attempts to get the lock at this point, it will change the value from 0x1 to 0x2 (using new ticket lock rules). Then the fsys.S code will free the lock using old spinlock rules by writing 0x0 to it. From here a variety of bad things can happen. Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* KVM: VMX: Fix host GDT.LIMIT corruptionAvi Kivity2010-09-261-0/+4
| | | | | | | | | | | | | commit 3444d7da1839b851eefedd372978d8a982316c36 upstream. vmx does not restore GDT.LIMIT to the host value, instead it sets it to 64KB. This means host userspace can learn a few bits of host memory. Fix by reloading GDTR when we load other host state. Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* KVM: MMU: fix mmu notifier invalidate handler for huge spteAndrea Arcangeli2010-09-261-2/+6
| | | | | | | | | | | | | commit 6e3e243c3b6e0bbd18c6ce0fbc12bc3fe2d77b34 upstream. The index wasn't calculated correctly (off by one) for huge spte so KVM guest was unstable with transparent hugepages. Signed-off-by: Andrea Arcangeli <aarcange@redhat.com> Reviewed-by: Reviewed-by: Rik van Riel <riel@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com> Cc: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* KVM: x86: emulator: inc/dec can have lock prefixGleb Natapov2010-09-261-2/+2
| | | | | | | | | | commit c0e0608cb902af1a1fd8d413ec0a07ee1e62c652 upstream. Mark inc (0xfe/0 0xff/0) and dec (0xfe/1 0xff/1) as lock prefix capable. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* KVM: MMU: fix direct sp's access corruptedXiao Guangrong2010-09-261-2/+26
| | | | | | | | | | | | | | | | | commit 9e7b0e7fba45ca3c6357aeb7091ebc281f1de365 upstream. If the mapping is writable but the dirty flag is not set, we will find the read-only direct sp and setup the mapping, then if the write #PF occur, we will mark this mapping writable in the read-only direct sp, now, other real read-only mapping will happily write it without #PF. It may hurt guest's COW Fixed by re-install the mapping when write #PF occur. Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* KVM: Prevent internal slots from being COWedAvi Kivity2010-09-261-1/+6
| | | | | | | | | | | | | commit 7ac77099ce88a0c31b75acd0ec5ef3da4415a6d8 upstream. If a process with a memory slot is COWed, the page will change its address (despite having an elevated reference count). This breaks internal memory slots which have their physical addresses loaded into vmcs registers (see the APIC access memory slot). Signed-off-by: Avi Kivity <avi@redhat.com> Cc: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* KVM: Keep slot ID in memory slot structureAvi Kivity2010-09-262-0/+2
| | | | | | | | | | | | commit e36d96f7cfaa71870c407131eb4fbd38ea285c01 upstream. May be used for distinguishing between internal and user slots, or for sorting slots in size order. Signed-off-by: Avi Kivity <avi@redhat.com> Cc: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* SCSI: mptsas: fix hangs caused by ATA pass-throughRyan Kuester2010-09-261-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 2a1b7e575b80ceb19ea50bfa86ce0053ea57181d upstream. I may have an explanation for the LSI 1068 HBA hangs provoked by ATA pass-through commands, in particular by smartctl. First, my version of the symptoms. On an LSI SAS1068E B3 HBA running 01.29.00.00 firmware, with SATA disks, and with smartd running, I'm seeing occasional task, bus, and host resets, some of which lead to hard faults of the HBA requiring a reboot. Abusively looping the smartctl command, # while true; do smartctl -a /dev/sdb > /dev/null; done dramatically increases the frequency of these failures to nearly one per minute. A high IO load through the HBA while looping smartctl seems to improve the chance of a full scsi host reset or a non-recoverable hang. I reduced what smartctl was doing down to a simple test case which causes the hang with a single IO when pointed at the sd interface. See the code at the bottom of this e-mail. It uses an SG_IO ioctl to issue a single pass-through ATA identify device command. If the buffer userspace gives for the read data has certain alignments, the task is issued to the HBA but the HBA fails to respond. If run against the sg interface, neither the test code nor smartctl causes a hang. sd and sg handle the SG_IO ioctl slightly differently. Unless you specifically set a flag to do direct IO, sg passes a buffer of its own, which is page-aligned, to the block layer and later copies the result into the userspace buffer regardless of its alignment. sd, on the other hand, always does direct IO unless the userspace buffer fails an alignment test at block/blk-map.c line 57, in which case a page-aligned buffer is created and used for the transfer. The alignment test currently checks for word-alignment, the default setup by scsi_lib.c; therefore, userspace buffers of almost any alignment are given directly to the HBA as DMA targets. The LSI 1068 hardware doesn't seem to like at least a couple of the alignments which cross a page boundary (see the test code below). Curiously, many page-boundary-crossing alignments do work just fine. So, either the hardware has an bug handling certain alignments or the hardware has a stricter alignment requirement than the driver is advertising. If stricter alignment is required, then in no case should misaligned buffers from userspace be allowed through without being bounced or at least causing an error to be returned. It seems the mptsas driver could use blk_queue_dma_alignment() to advertise a stricter alignment requirement. If it does, sd does the right thing and bounces misaligned buffers (see block/blk-map.c line 57). The following patch to 2.6.34-rc5 makes my symptoms go away. I'm sure this is the wrong place for this code, but it gets my idea across. Acked-by: Kashyap Desai <Kashyap.Desai@lsi.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* inotify: send IN_UNMOUNT eventsEric Paris2010-09-261-2/+5
| | | | | | | | | | | | commit 611da04f7a31b2208e838be55a42c7a1310ae321 upstream. Since the .31 or so notify rewrite inotify has not sent events about inodes which are unmounted. This patch restores those events. Signed-off-by: Eric Paris <eparis@redhat.com> Cc: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* drm/nv50: initialize ramht_refs list for faked 0 channelMarcin Slusarz2010-09-261-0/+2
| | | | | | | | | | | | | | commit 615661f3948a066fd22a36fe8ea0c528b75ee373 upstream. We need it for PFIFO_INTR_CACHE_ERROR interrupt handling, because nouveau_fifo_swmthd looks for matching gpuobj in ramht_refs list. It fixes kernel panic in nouveau_gpuobj_ref_find. Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com> Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* GFS2: gfs2_logd should be using interruptible waitsSteven Whitehouse2010-09-261-1/+1
| | | | | | | | | | | commit 5f4874903df3562b9d5649fc1cf7b8c6bb238e42 upstream. Looks like this crept in, in a recent update. Reported-by: Krzysztof Urbaniak <urban@bash.org.pl> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* x86 platform drivers: hp-wmi Reorder event id processingThomas Renninger2010-09-261-19/+32
| | | | | | | | | | | | | | | | | commit 751ae808f6b29803228609f51aa1ae057f5c576e upstream. Event id 0x4 defines the hotkey event. No need (or even wrong) to query HPWMI_HOTKEY_QUERY if event id is != 0x4. Reorder the eventcode conditionals and use switch case instead of if/else. Use an enum for the event ids cases. Signed-off-by: Thomas Renninger <trenn@suse.de> Signed-off-by: Matthew Garrett <mjg@redhat.com> CC: linux-acpi@vger.kernel.org CC: platform-driver-x86@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* aio: check for multiplication overflow in do_io_submitJeff Moyer2010-09-261-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | commit 75e1c70fc31490ef8a373ea2a4bea2524099b478 upstream. Tavis Ormandy pointed out that do_io_submit does not do proper bounds checking on the passed-in iocb array:        if (unlikely(nr < 0))                return -EINVAL;        if (unlikely(!access_ok(VERIFY_READ, iocbpp, (nr*sizeof(iocbpp)))))                return -EFAULT;                      ^^^^^^^^^^^^^^^^^^ The attached patch checks for overflow, and if it is detected, the number of iocbs submitted is scaled down to a number that will fit in the long.  This is an ok thing to do, as sys_io_submit is documented as returning the number of iocbs submitted, so callers should handle a return value of less than the 'nr' argument passed in. Reported-by: Tavis Ormandy <taviso@cmpxchg8b.com> Signed-off-by: Jeff Moyer <jmoyer@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* aio: do not return ERESTARTSYS as a result of AIOJan Kara2010-09-261-1/+9
| | | | | | | | | | | | | | | | | | | | | | commit a0c42bac79731276c9b2f28d54f9e658fcf843a2 upstream. OCFS2 can return ERESTARTSYS from its write function when the process is signalled while waiting for a cluster lock (and the filesystem is mounted with intr mount option). Generally, it seems reasonable to allow filesystems to return this error code from its IO functions. As we must not leak ERESTARTSYS (and similar error codes) to userspace as a result of an AIO operation, we have to properly convert it to EINTR inside AIO code (restarting the syscall isn't really an option because other AIO could have been already submitted by the same io_submit syscall). Signed-off-by: Jan Kara <jack@suse.cz> Reviewed-by: Jeff Moyer <jmoyer@redhat.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: Zach Brown <zach.brown@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* percpu: fix pcpu_last_unit_cpuTejun Heo2010-09-261-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | commit 46b30ea9bc3698bc1d1e6fd726c9601d46fa0a91 upstream. pcpu_first/last_unit_cpu are used to track which cpu has the first and last units assigned. This in turn is used to determine the span of a chunk for man/unmap cache flushes and whether an address belongs to the first chunk or not in per_cpu_ptr_to_phys(). When the number of possible CPUs isn't power of two, a chunk may contain unassigned units towards the end of a chunk. The logic to determine pcpu_last_unit_cpu was incorrect when there was an unused unit at the end of a chunk. It failed to ignore the unused unit and assigned the unused marker NR_CPUS to pcpu_last_unit_cpu. This was discovered through kdump failure which was caused by malfunctioning per_cpu_ptr_to_phys() on a kvm setup with 50 possible CPUs by CAI Qian. Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: CAI Qian <caiqian@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* vmscan: check all_unreclaimable in direct reclaim pathMinchan Kim2010-09-261-8/+35
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit d1908362ae0b97374eb8328fbb471576332f9fb1 upstream. M. Vefa Bicakci reported 2.6.35 kernel hang up when hibernation on his 32bit 3GB mem machine. (https://bugzilla.kernel.org/show_bug.cgi?id=16771). Also he bisected the regression to commit bb21c7ce18eff8e6e7877ca1d06c6db719376e3c Author: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Date: Fri Jun 4 14:15:05 2010 -0700 vmscan: fix do_try_to_free_pages() return value when priority==0 reclaim failure At first impression, this seemed very strange because the above commit only chenged function return value and hibernate_preallocate_memory() ignore return value of shrink_all_memory(). But it's related. Now, page allocation from hibernation code may enter infinite loop if the system has highmem. The reasons are that vmscan don't care enough OOM case when oom_killer_disabled. The problem sequence is following as. 1. hibernation 2. oom_disable 3. alloc_pages 4. do_try_to_free_pages if (scanning_global_lru(sc) && !all_unreclaimable) return 1; If kswapd is not freozen, it would set zone->all_unreclaimable to 1 and then shrink_zones maybe return true(ie, all_unreclaimable is true). So at last, alloc_pages could go to _nopage_. If it is, it should have no problem. This patch adds all_unreclaimable check to protect in direct reclaim path, too. It can care of hibernation OOM case and help bailout all_unreclaimable case slightly. Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Signed-off-by: Minchan Kim <minchan.kim@gmail.com> Reported-by: M. Vefa Bicakci <bicave@superonline.com> Reported-by: <caiqian@redhat.com> Reviewed-by: Johannes Weiner <hannes@cmpxchg.org> Tested-by: <caiqian@redhat.com> Acked-by: Rafael J. Wysocki <rjw@sisk.pl> Acked-by: Rik van Riel <riel@redhat.com> Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Balbir Singh <balbir@in.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* /proc/vmcore: fix seekingArnd Bergmann2010-09-261-1/+1
| | | | | | | | | | | | | | | | | | | | | | | commit c227e69028473c7c7994a9b0a2cc0034f3f7e0fe upstream. Commit 73296bc611 ("procfs: Use generic_file_llseek in /proc/vmcore") broke seeking on /proc/vmcore. This changes it back to use default_llseek in order to restore the original behaviour. The problem with generic_file_llseek is that it only allows seeks up to inode->i_sb->s_maxbytes, which is zero on procfs and some other virtual file systems. We should merge generic_file_llseek and default_llseek some day and clean this up in a proper way, but for 2.6.35/36, reverting vmcore is the safer solution. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Cc: Frederic Weisbecker <fweisbec@gmail.com> Reported-by: CAI Qian <caiqian@redhat.com> Tested-by: CAI Qian <caiqian@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* Prevent freeing uninitialized pointer in compat_do_readv_writevDan Rosenberg2010-09-261-1/+1
| | | | | | | | | | | | | | | | | | | | commit 767b68e96993e29e3480d7ecdd9c4b84667c5762 upstream. In 32-bit compatibility mode, the error handling for compat_do_readv_writev() may free an uninitialized pointer, potentially leading to all sorts of ugly memory corruption. This is reliably triggerable by unprivileged users by invoking the readv()/writev() syscalls with an invalid iovec pointer. The below patch fixes this to emulate the non-compat version. Introduced by commit b83733639a49 ("compat: factor out compat_rw_copy_check_uvector from compat_do_readv_writev") Signed-off-by: Dan Rosenberg <dan.j.rosenberg@gmail.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>