summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* Merge branch 'for-linus' of ↵Linus Torvalds2008-11-067-26/+29
|\ | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394-2.6: ieee1394: dv1394: fix possible deadlock in multithreaded clients ieee1394: raw1394: fix possible deadlock in multithreaded clients ieee1394: struct device - replace bus_id with dev_name(), dev_set_name() firewire: struct device - replace bus_id with dev_name(), dev_set_name()
| * ieee1394: dv1394: fix possible deadlock in multithreaded clientsStefan Richter2008-10-311-2/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix a possible though highly unlikely deadlock: Thread A: Thread B: - acquire mmap_sem - dv1394_ioctl/read/write() - dv1394_mmap() - acquire video->mtx - acquire video->mtx - copy_to/from_user(), possible page fault: acquire mmap_sem The simplest fix is to use mutex_trylock() instead of mutex_lock() in dv1394_mmap(). This changes the behavior under contention in a way which is visible to userspace clients. However, my guess is that no clients exist which use mmap vs. ioctl/read/write on the dv1394 character device file interface in concurrent threads. Reported-by: Johannes Weiner <hannes@saeurebad.de> Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
| * ieee1394: raw1394: fix possible deadlock in multithreaded clientsStefan Richter2008-10-311-3/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Regression in 2.6.28-rc1: When I added the new state_mutex which prevents corruption of raw1394's internal state when accessed by multithreaded client applications, the following possible though highly unlikely deadlock slipped in: Thread A: Thread B: - acquire mmap_sem - raw1394_write() or raw1394_ioctl() - raw1394_mmap() - acquire state_mutex - acquire state_mutex - copy_to/from_user(), possible page fault: acquire mmap_sem The simplest fix is to use mutex_trylock() instead of mutex_lock() in raw1394_mmap(). This changes the behavior under contention in a way which is visible to userspace clients. However, since multithreaded access was entirely buggy before state_mutex was added and libraw1394's documentation advised application programmers to use a handle only in a single thread, this change in behaviour should not be an issue in practice at all. Since we have to use mutex_trylock() in raw1394_mmap() regardless whether /dev/raw1394 was opened with O_NONBLOCK or not, we now use mutex_trylock() unconditionally everywhere for state_mutex, just to have consistent behavior. Reported-by: Johannes Weiner <hannes@saeurebad.de> Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
| * ieee1394: struct device - replace bus_id with dev_name(), dev_set_name()Kay Sievers2008-10-312-11/+7
| | | | | | | | | | | | Acked-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
| * firewire: struct device - replace bus_id with dev_name(), dev_set_name()Kay Sievers2008-10-313-10/+8
| | | | | | | | | | | | Acked-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
* | Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-blockLinus Torvalds2008-11-069-95/+143
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * 'for-linus' of git://git.kernel.dk/linux-2.6-block: Block: use round_jiffies_up() Add round_jiffies_up and related routines block: fix __blkdev_get() for removable devices generic-ipi: fix the smp_mb() placement blk: move blk_delete_timer call in end_that_request_last block: add timer on blkdev_dequeue_request() not elv_next_request() bio: define __BIOVEC_PHYS_MERGEABLE block: remove unused ll_new_mergeable()
| * | Block: use round_jiffies_up()Alan Stern2008-11-061-17/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch (as1159b) changes the timeout routines in the block core to use round_jiffies_up(). There's no point in rounding the timer deadline down, since if it expires too early we will have to restart it. The patch also removes some unnecessary tests when a request is removed from the queue's timer list. Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
| * | Add round_jiffies_up and related routinesAlan Stern2008-11-062-30/+104
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch (as1158b) adds round_jiffies_up() and friends. These routines work like the analogous round_jiffies() functions, except that they will never round down. The new routines will be useful for timeouts where we don't care exactly when the timer expires, provided it doesn't expire too soon. Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
| * | block: fix __blkdev_get() for removable devicesTejun Heo2008-11-061-12/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 0762b8bde9729f10f8e6249809660ff2ec3ad735 moved disk_get_part() in front of recursive get on the whole disk, which caused removable devices to try disk_get_part() before rescanning after a new media is inserted, which might fail legit open attempts or give the old partition. This patch fixes the problem by moving disk_get_part() after __blkdev_get() on the whole disk. This problem was spotted by Borislav Petkov. Signed-off-by: Tejun Heo <tj@kernel.org> Tested-by: Borislav Petkov <petkovbb@gmail.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
| * | generic-ipi: fix the smp_mb() placementSuresh Siddha2008-11-061-6/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | smp_mb() is needed (to make the memory operations visible globally) before sending the ipi on the sender and the receiver (on Alpha atleast) needs smp_read_barrier_depends() in the handler before reading the call_single_queue list in a lock-free fashion. On x86, x2apic mode register accesses for sending IPI's don't have serializing semantics. So the need for smp_mb() before sending the IPI becomes more critical in x2apic mode. Remove the unnecessary smp_mb() in csd_flag_wait(), as the presence of that smp_mb() doesn't mean anything on the sender, when the ipi receiver is not doing any thing special (like memory fence) after clearing the CSD_FLAG_WAIT. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
| * | blk: move blk_delete_timer call in end_that_request_lastMike Anderson2008-11-061-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Move the calling blk_delete_timer to later in end_that_request_last to address an issue where blkdev_dequeue_request may have add a timer for the request. Signed-off-by: Mike Anderson <andmike@linux.vnet.ibm.com> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
| * | block: add timer on blkdev_dequeue_request() not elv_next_request()Tejun Heo2008-11-061-6/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Block queue supports two usage models - one where block driver peeks at the front of queue using elv_next_request(), processes it and finishes it and the other where block driver peeks at the front of queue, dequeue the request using blkdev_dequeue_request() and finishes it. The latter is more flexible as it allows the driver to process multiple commands concurrently. These two inconsistent usage models affect the block layer implementation confusing. For some, elv_next_request() is considered the issue point while others consider blkdev_dequeue_request() the issue point. Till now the inconsistency mostly affect only accounting, so it didn't really break anything seriously; however, with block layer timeout, this inconsistency hits hard. Block layer considers elv_next_request() the issue point and adds timer but SCSI layer thinks it was just peeking and when the request can't process the command right away, it's just left there without further processing. This makes the request dangling on the timer list and, when the timer goes off, the request which the SCSI layer and below think is still on the block queue ends up in the EH queue, causing various problems - EH hang (failed count goes over busy count and EH never wakes up), WARN_ON() and oopses as low level driver trying to handle the unknown command, etc. depending on the timing. As SCSI midlayer is the only user of block layer timer at the moment, moving blk_add_timer() to elv_dequeue_request() fixes the problem; however, this two usage models definitely need to be cleaned up in the future. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
| * | bio: define __BIOVEC_PHYS_MERGEABLEJeremy Fitzhardinge2008-11-061-1/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | Define __BIOVEC_PHYS_MERGEABLE as the default implementation of BIOVEC_PHYS_MERGEABLE, so that its available for reuse within an arch-specific definition of BIOVEC_PHYS_MERGEABLE. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
| * | block: remove unused ll_new_mergeable()FUJITA Tomonori2008-11-061-21/+0
| | | | | | | | | | | | | | | Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
* | | Merge git://git.kernel.org/pub/scm/linux/kernel/git/wim/linux-2.6-watchdogLinus Torvalds2008-11-061-4/+4
|\ \ \ | | | | | | | | | | | | | | | | | | | | * git://git.kernel.org/pub/scm/linux/kernel/git/wim/linux-2.6-watchdog: [WATCHDOG] SAM9 watchdog - supported on all SAM9 and CAP9 processors [WATCHDOG] SAM9 watchdog - update for moved headers
| * | | [WATCHDOG] SAM9 watchdog - supported on all SAM9 and CAP9 processorsAndrew Victor2008-11-061-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The SAM9 watchdog driver is usable on the whole family of AT91SAM9 and CAP9 processors. Update the configuration to indicate this and allow the driver to be selected. Signed-off-by: Andrew Victor <linux@maxim.org.za> Signed-off-by: Wim Van Sebroeck <wim@iguana.be> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
| * | | [WATCHDOG] SAM9 watchdog - update for moved headersAndrew Victor2008-11-061-1/+1
| |/ / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The architecture header files were recently moved from include/asm-arm/mach-at91/ to arch/arm/mach-at91/include/mach/. The SAM9 watchdog driver still includes a header from the old location. Signed-off-by: Andrew Victor <linux@maxim.org.za> Signed-off-by: Wim Van Sebroeck <wim@iguana.be> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
* | | Merge branch 'for-linus' of git://neil.brown.name/mdLinus Torvalds2008-11-063-7/+3
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | * 'for-linus' of git://neil.brown.name/md: md: linear: Fix a division by zero bug for very small arrays. md: fix bug in raid10 recovery. md: revert the recent addition of a call to the BLKRRPART ioctl.
| * | | md: linear: Fix a division by zero bug for very small arrays.Andre Noll2008-11-061-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We currently oops with a divide error on starting a linear software raid array consisting of at least two very small (< 500K) devices. The bug is caused by the calculation of the hash table size which tries to compute sector_div(sz, base) with "base" being zero due to the small size of the component devices of the array. Fix this by requiring the hash spacing to be at least one which implies that also "base" is non-zero. This bug has existed since about 2.6.14. Cc: stable@kernel.org Signed-off-by: Andre Noll <maan@systemlinux.org> Signed-off-by: NeilBrown <neilb@suse.de>
| * | | md: fix bug in raid10 recovery.NeilBrown2008-11-061-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Adding a spare to a raid10 doesn't cause recovery to start. This is due to an silly type in commit 6c2fce2ef6b4821c21b5c42c7207cb9cf8c87eda and so is a bug in 2.6.27 and .28-rc. Thanks to Thomas Backlund for bisecting to find this. Cc: Thomas Backlund <tmb@mandriva.org> Cc: stable@kernel.org Signed-off-by: NeilBrown <neilb@suse.de>
| * | | md: revert the recent addition of a call to the BLKRRPART ioctl.NeilBrown2008-11-061-6/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | It turns out that it is only safe to call blkdev_ioctl when the device is actually open (as ->bd_disk is set to NULL on last close). And it is quite possible for do_md_stop to be called when the device is not open. So discard the call to blkdev_ioctl(BLKRRPART) which was added in commit 934d9c23b4c7e31840a895ba4b7e88d6413c81f3 It is just as easy to call this ioctl from userspace when needed (on mdadm -S) so leave it out of the kernel Signed-off-by: NeilBrown <neilb@suse.de>
* | | | Merge branch 'merge' of ↵Linus Torvalds2008-11-063-1/+2
|\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc * 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc: powerpc: Fix "unused variable" warning in pci_dlpar.c powerpc/cell: Fix compile error in ras.c powerpc/ps3: Fix compile error in ps3-lpm.c
| * | | | powerpc: Fix "unused variable" warning in pci_dlpar.cStephen Rothwell2008-11-051-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This gets rid of this build warning: arch/powerpc/platforms/pseries/pci_dlpar.c: In function 'init_phb_dynamic': arch/powerpc/platforms/pseries/pci_dlpar.c:192: warning: unused variable 'b' This is one of the very few warnings left in a ppc64_defconfig build and getting rid of it will make it easier to see future introduced ones (in fact this was introduced very recently). Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Paul Mackerras <paulus@samba.org>
| * | | | powerpc/cell: Fix compile error in ras.cAlexey Dobriyan2008-11-051-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This fixes this error on Cell when CONFIG_KEXEC = n: arch/powerpc/platforms/cell/ras.c:299: error: implicit declaration of function 'crash_shutdown_register' We have to include <asm/kexec.h> because it contains the dummy definition of crash_shutdown_register that is used when CONFIG_KEXEC=n, but <linux/kexec.h> doesn't include <asm/kexec.h> in that case. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
| * | | | powerpc/ps3: Fix compile error in ps3-lpm.cAlexey Dobriyan2008-11-051-0/+1
| | |/ / | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Compiling with CONFIG_SMP = n and CONFIG_PS3_LPM != n gives this error: drivers/ps3/ps3-lpm.c:838: error: implicit declaration of function 'get_hard_smp_processor_id' This fixes it. We have to include <asm/smp.h> rather than <linux/smp.h> because the UP definition of get_hard_smp_processor_id() is in <asm/smp.h>, and <linux/smp.h> only includes <asm/smp.h> if CONFIG_SMP = y. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Acked-by: Geoff Levand <geoffrey.levand@am.sony.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
* | | | Merge branch 'for-linus' of ↵Linus Torvalds2008-11-062-27/+37
|\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs: net/9p: fix printk format warnings unsigned fid->fid cannot be negative 9p: rdma: remove duplicated #include p9: Fix leak of waitqueue in request allocation path 9p: Remove unneeded free of fcall for Flush 9p: Make all client spin locks IRQ safe 9p: rdma: Set trans prior to requesting async connection ops
| * | | | net/9p: fix printk format warningsRandy Dunlap2008-11-051-11/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix printk format warnings in net/9p. Built cleanly on 7 arches. net/9p/client.c:820: warning: format '%llx' expects type 'long long unsigned int', but argument 4 has type 'u64' net/9p/client.c:820: warning: format '%llx' expects type 'long long unsigned int', but argument 5 has type 'u64' net/9p/client.c:867: warning: format '%llx' expects type 'long long unsigned int', but argument 4 has type 'u64' net/9p/client.c:867: warning: format '%llx' expects type 'long long unsigned int', but argument 5 has type 'u64' net/9p/client.c:932: warning: format '%llx' expects type 'long long unsigned int', but argument 5 has type 'u64' net/9p/client.c:932: warning: format '%llx' expects type 'long long unsigned int', but argument 6 has type 'u64' net/9p/client.c:982: warning: format '%llx' expects type 'long long unsigned int', but argument 4 has type 'u64' net/9p/client.c:982: warning: format '%llx' expects type 'long long unsigned int', but argument 5 has type 'u64' net/9p/client.c:1025: warning: format '%llx' expects type 'long long unsigned int', but argument 4 has type 'u64' net/9p/client.c:1025: warning: format '%llx' expects type 'long long unsigned int', but argument 5 has type 'u64' net/9p/client.c:1227: warning: format '%llx' expects type 'long long unsigned int', but argument 7 has type 'u64' net/9p/client.c:1227: warning: format '%llx' expects type 'long long unsigned int', but argument 12 has type 'u64' net/9p/client.c:1227: warning: format '%llx' expects type 'long long unsigned int', but argument 8 has type 'u64' net/9p/client.c:1227: warning: format '%llx' expects type 'long long unsigned int', but argument 13 has type 'u64' net/9p/client.c:1252: warning: format '%llx' expects type 'long long unsigned int', but argument 7 has type 'u64' net/9p/client.c:1252: warning: format '%llx' expects type 'long long unsigned int', but argument 12 has type 'u64' net/9p/client.c:1252: warning: format '%llx' expects type 'long long unsigned int', but argument 8 has type 'u64' net/9p/client.c:1252: warning: format '%llx' expects type 'long long unsigned int', but argument 13 has type 'u64' Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
| * | | | unsigned fid->fid cannot be negativeRoel Kluin2008-11-051-4/+5
| | | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Roel Kluin <roel.kluin@gmail.com> Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
| * | | | 9p: rdma: remove duplicated #includeHuang Weiyi2008-11-051-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Removed duplicated #include <rdma/ib_verbs.h> in net/9p/trans_rdma.c. Signed-off-by: Huang Weiyi <weiyi.huang@gmail.com> Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
| * | | | p9: Fix leak of waitqueue in request allocation pathTom Tucker2008-11-051-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If a T or R fcall cannot be allocated, the function returns an error but neglects to free the wait queue that was successfully allocated. If it comes through again a second time this wq will be overwritten with a new allocation and the old allocation will be leaked. Also, if the client is subsequently closed, the close path will attempt to clean up these allocations, so set the req fields to NULL to avoid duplicate free. Signed-off-by: Tom Tucker <tom@opengridcomputing.com> Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
| * | | | 9p: Remove unneeded free of fcall for FlushTom Tucker2008-11-051-6/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | T and R fcall are reused until the client is destroyed. There does not need to be a special case for Flush Signed-off-by: Tom Tucker <tom@opengridcomputing.com> Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
| * | | | 9p: Make all client spin locks IRQ safeTom Tucker2008-11-051-4/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The client lock must be IRQ safe. Some of the lock acquisition paths took regular spin locks. Signed-off-by: Tom Tucker <tom@opengridcomputing.com> Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
| * | | | 9p: rdma: Set trans prior to requesting async connection opsTom Tucker2008-11-051-1/+3
| |/ / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The RDMA connection manager is fundamentally asynchronous. Since the async callback context is the client pointer, the transport in the client struct needs to be set prior to calling the first async op. Signed-off-by: Tom Tucker <tom@opengridcomputing.com> Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
* | | | Merge branch 'sched-fixes-for-linus' of ↵Linus Torvalds2008-11-065-34/+60
|\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: sched: re-tune balancing sched: fix buddies for group scheduling sched: backward looking buddy sched: fix fair preempt check sched: cleanup fair task selection
| * | | | sched: re-tune balancingIngo Molnar2008-11-052-5/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Impact: improve wakeup affinity on NUMA systems, tweak SMP systems Given the fixes+tweaks to the wakeup-buddy code, re-tweak the domain balancing defaults on NUMA and SMP systems. Turn on SD_WAKE_AFFINE which was off on x86 NUMA - there's no reason why we would not want to have wakeup affinity across nodes as well. (we already do this in the standard NUMA template.) lat_ctx on a NUMA box is particularly happy about this change: before: | phoenix:~/l> ./lat_ctx -s 0 2 | "size=0k ovr=2.60 | 2 5.70 after: | phoenix:~/l> ./lat_ctx -s 0 2 | "size=0k ovr=2.65 | 2 2.07 a 2.75x speedup. pipe-test is similarly happy about it too: | phoenix:~/sched-tests> ./pipe-test | 18.26 usecs/loop. | 14.70 usecs/loop. | 14.38 usecs/loop. | 10.55 usecs/loop. # +WAKE_AFFINE on domain0+domain1 | 8.63 usecs/loop. | 8.59 usecs/loop. | 9.03 usecs/loop. | 8.94 usecs/loop. | 8.96 usecs/loop. | 8.63 usecs/loop. Also: - disable SD_BALANCE_NEWIDLE on NUMA and SMP domains (keep it for siblings) - enable SD_WAKE_BALANCE on SMP domains Sysbench+postgresql improves all around the board, quite significantly: .28-rc3-11474e2c .28-rc3-11474e2c-tune ------------------------------------------------- 1: 571 688 +17.08% 2: 1236 1206 -2.55% 4: 2381 2642 +9.89% 8: 4958 5164 +3.99% 16: 9580 9574 -0.07% 32: 7128 8118 +12.20% 64: 7342 8266 +11.18% 128: 7342 8064 +8.95% 256: 7519 7884 +4.62% 512: 7350 7731 +4.93% ------------------------------------------------- SUM: 55412 59341 +6.62% So it's a win both for the runup portion, the peak area and the tail. Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | sched: fix buddies for group schedulingPeter Zijlstra2008-11-051-2/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Impact: scheduling order fix for group scheduling For each level in the hierarchy, set the buddy to point to the right entity. Therefore, when we do the hierarchical schedule, we have a fair chance of ending up where we meant to. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: Mike Galbraith <efault@gmx.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | sched: backward looking buddyPeter Zijlstra2008-11-053-9/+30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Impact: improve/change/fix wakeup-buddy scheduling Currently we only have a forward looking buddy, that is, we prefer to schedule to the task we last woke up, under the presumption that its going to consume the data we just produced, and therefore will have cache hot benefits. This allows co-waking producer/consumer task pairs to run ahead of the pack for a little while, keeping their cache warm. Without this, we would interleave all pairs, utterly trashing the cache. This patch introduces a backward looking buddy, that is, suppose that in the above scenario, the consumer preempts the producer before it can go to sleep, we will therefore miss the wakeup from consumer to producer (its already running, after all), breaking the cycle and reverting to the cache-trashing interleaved schedule pattern. The backward buddy will try to schedule back to the task that woke us up in case the forward buddy is not available, under the assumption that the last task will be the one with the most cache hot task around barring current. This will basically allow a task to continue after it got preempted. In order to avoid starvation, we allow either buddy to get wakeup_gran ahead of the pack. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: Mike Galbraith <efault@gmx.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | sched: fix fair preempt checkPeter Zijlstra2008-11-051-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Impact: fix cross-class preemption Inter-class wakeup preemptions should go on class order. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: Mike Galbraith <efault@gmx.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | sched: cleanup fair task selectionPeter Zijlstra2008-11-051-22/+11
| |/ / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Impact: cleanup Clean up task selection Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: Mike Galbraith <efault@gmx.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* | | | Fix accidental implicit cast in HR-timer conversionDavid Howells2008-11-061-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix the hrtimer_add_expires_ns() function. It should take a 'u64 ns' argument, but rather takes an 'unsigned long ns' argument - which might only be 32-bits. On FRV, this results in the kernel locking up because hrtimer_forward() passes the result of a 64-bit multiplication to this function, for which the compiler discards the top 32-bits - something that didn't happen when ktime_add_ns() was called directly. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | | Merge git://git.infradead.org/mtd-2.6Linus Torvalds2008-11-066-33/+39
|\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * git://git.infradead.org/mtd-2.6: [JFFS2] fix race condition in jffs2_lzo_compress() [MTD] [NOR] Fix cfi_send_gen_cmd handling of x16 devices in x8 mode (v4) [JFFS2] Fix lack of locking in thread_should_wake() [JFFS2] Fix build failure with !CONFIG_JFFS2_FS_WRITEBUFFER [MTD] [NAND] OMAP2: remove duplicated #include
| * | | | [JFFS2] fix race condition in jffs2_lzo_compress()Geert Uytterhoeven2008-11-051-6/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | deflate_mutex protects the globals lzo_mem and lzo_compress_buf. However, jffs2_lzo_compress() unlocks deflate_mutex _before_ it has copied out the compressed data from lzo_compress_buf. Correct this by moving the mutex unlock after the copy. In addition, document what deflate_mutex actually protects. Cc: stable@kernel.org Signed-off-by: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com> Acked-by: Richard Purdie <rpurdie@openedhand.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
| * | | | [MTD] [NOR] Fix cfi_send_gen_cmd handling of x16 devices in x8 mode (v4)Eric W. Biederman2008-11-053-22/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | For "unlock" cycles to 16bit devices in 8bit compatibility mode we need to use the byte addresses 0xaaa and 0x555. These effectively match the word address 0x555 and 0x2aa, except the latter has its low bit set. Most chips don't care about the value of the 'A-1' pin in x8 mode, but some -- like the ST M29W320D -- do. So we need to be careful to set it where appropriate. cfi_send_gen_cmd is only ever passed addresses where the low byte is 0x00, 0x55 or 0xaa. Of those, only addresses ending 0xaa are affected by this patch, by masking in the extra low bit when the device is known to be in compatibility mode. [dwmw2: Do it only when (cmd_ofs & 0xff) == 0xaa] v4: Fix stupid typo in cfi_build_cmd_addr that failed to compile I'm writing this patch way to late at night. v3: Bring all of the work back into cfi_build_cmd_addr including calling of map_bankwidth(map) and cfi_interleave(cfi) So every caller doesn't need to. v2: Only modified the address if we our device_type is larger than our bus width. Cc: stable@kernel.org Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
| * | | | [JFFS2] Fix lack of locking in thread_should_wake()David Woodhouse2008-10-311-5/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The thread_should_wake() function trawls through the list of 'very dirty' eraseblocks, determining whether the background GC thread should wake. Doing this without holding the appropriate locks is a bad idea. OLPC Trac #8615 Signed-off-by: David Woodhouse <David.Woodhouse@intel.com> Cc: stable@kernel.org
| * | | | [JFFS2] Fix build failure with !CONFIG_JFFS2_FS_WRITEBUFFERSteve Glendinning2008-10-211-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Build failure introduced by 5bf1723723487ddb0b9c9641b6559da96b27cc93 [JFFS2] Write buffer offset adjustment for NOR-ECC (Sibley) flash Signed-off-by: Steve Glendinning <steve.glendinning@smsc.com> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
| * | | | [MTD] [NAND] OMAP2: remove duplicated #includeHuang Weiyi2008-10-211-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Removed duplicated #include <asm/arch/gpmc.h> in drivers/mtd/onenand/omap2.c. Signed-off-by: Huang Weiyi <weiyi.huang@gmail.com> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
* | | | | fat: i_blocks warning fixOGAWA Hirofumi2008-11-064-6/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | blkcnt_t type depends on CONFIG_LSF. Use unsigned long long always for printk(). But lazy to type it, so add "llu" and use it. Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | | | fat: ->i_pos race fixOGAWA Hirofumi2008-11-061-2/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | i_pos is 64bits value, hence it's not atomic to update. Important place is fat_write_inode() only, other places without lock are just for printk(). This adds lock for "BITS_PER_LONG == 32" kernel. Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | | | fat: mmu_private race fixOGAWA Hirofumi2008-11-064-10/+25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | mmu_private is 64bits value, hence it's not atomic to update. So, the access rule for mmu_private is we must hold ->i_mutex. But, fat_get_block() path doesn't follow the rule on non-allocation path. This fixes by using i_size instead if non-allocation path. Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | | | fat: Add printf attribute to fat_fs_panic()OGAWA Hirofumi2008-11-061-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>