summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* [SCSI] mvsas: add support for 94xx; layout change; bug fixesAndy Yan2009-05-2011-1377/+3893
| | | | | | | | | | | | | | | | | | This version contains following main changes - Switch to new layout to support more types of ASIC. - SSP TMF supported and related Error Handing enhanced. - Support flash feature with delay 2*HZ when PHY changed. - Support Marvell 94xx series ASIC for 6G SAS/SATA, which has 2 88SE64xx chips but any different register description. - Support SPI flash for HBA-related configuration info. - Other patch enhanced from kernel side such as increasing PHY type [jejb: fold back in DMA_BIT_MASK changes] Signed-off-by: Ying Chu <jasonchu@marvell.com> Signed-off-by: Andy Yan <ayan@marvell.com> Signed-off-by: Ke Wei <kewei@marvell.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
* [SCSI] mvsas: split driver into multiple filesJeff Garzik2009-05-208-2112/+2273
| | | | | | | | Split mvsas driver into multiple source codes, based on the split and function distribution found in Marvell's mvsas update. Signed-off-by: Jeff Garzik <jgarzik@redhat.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
* [SCSI] mvsas: move into new directory drivers/scsi/mvsas/Jeff Garzik2009-05-205-11/+63
| | | | | | | | | | | Zero functional changes, just file movement. This commit prepares for the upcoming integration of the Marvell-provided driver update that splits the driver into support for both 64xx and 94xx chip families. Signed-off-by: Jeff Garzik <jgarzik@redhat.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
* [SCSI] qla2xxx: Update version number to 8.03.01-k2.Andrew Vasquez2009-05-201-1/+1
| | | | | Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
* [SCSI] qla2xxx: Use port number to compute nvram/vpd parameter offsets.Anirban Chakraborty2009-05-204-18/+30
| | | | | | | | | | Read adapter's physical port number from interrupt pin register and use it instead of pci function number to offset into the nvram to obtain the port's configuration parameters. Signed-off-by: Anirban Chakraborty <anirban.chakraborty@qlogic.com> Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
* [SCSI] qla2xxx: Add an override option to specify ISP firmware load semantics.Andrew Vasquez2009-05-203-0/+17
| | | | | | | | As it may be useful during debugging to use a specific firmware image. Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
* [SCSI] qla2xxx: Don't try to 'stop' firmware if already in ROM code.Andrew Vasquez2009-05-202-1/+3
| | | | | Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
* [SCSI] qla2xxx: Conditionally disable automatic queue full tracking.Michael Reed2009-05-203-0/+19
| | | | | | | | | | | | | | | | | | | | | | | | | Changing a lun's queue depth (/sys/block/sdX/device/queue_depth) isn't sticky when the device is connected via a QLogic fibre channel adapter. The QLogic qla2xxx fibre channel driver dynamically adjusts a lun's queue depth. If a user has a specific need to limit the number of commands issued to a lun (say a tape drive, or a shared raid where the total commands issued to all luns is limited at the controller level, for example) and writes a limiting value to /sys/block/sdXX/device/queue_depth, the qla2xxx driver will silently and gradually increase the queue depth back to the driver limit of ql2xmaxqdepth. While reducing this value (module parameter) or increasing the interval between ramp ups (ql2xqfullrampup) offers the potential for a work around it would be better to have the option of just disabling the dynamic adjustment of queue depth. This patch implements an "off switch" as a module parameter. Signed-off-by: Michael Reed <mdr@sgi.com> Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
* [SCSI] qla2xxx: Perform an implicit login to the Management Server.Joe Carnuccio2009-05-201-1/+1
| | | | | | | | | Set the conditional plogi option bit whenever logging in the fabric management server (if it is already logged in, it does not need an explicit login; an implicit login suffices). Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
* [SCSI] qla2xxx: Restrict model-name/description device-table usage.Andrew Vasquez2009-05-201-2/+5
| | | | | | | | Information present in static table is only valid for pre-ISP25xx adapters. Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
* [SCSI] qla2xxx: Correct hard-coded address of a second-port's NVRAM.Harish Zunjarrao2009-05-201-1/+1
| | | | | | Signed-off-by: Harish Zunjarrao <harish.zunjarrao@qlogic.com> Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
* [SCSI] qla2xxx: Correct typo in read_nvram() callback.Andrew Vasquez2009-05-201-1/+1
| | | | | Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
* [SCSI] qla2xxx: Remove reference to request queue from scsi request block.Anirban Chakraborty2009-05-205-25/+27
| | | | | | | | | | srbs used to maintain a reference to the request queue on which it was enqueued. This is no longer required as the request queue pointer is now maintained in the scsi host that issues the srb. Signed-off-by: Anirban Chakraborty <anirban.chakraborty@qlogic.com> Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
* [SCSI] qla2xxx: Add CPU affinity support.Anirban Chakraborty2009-05-207-9/+140
| | | | | | | | | | | | | | Set the module parameter ql2xmultique_tag to 1 to enable this feature. In this mode, the total number of response queues created is equal to the number of online cpus. Turning the block layer's rq_affinity mode on enables requests to be routed to the proper cpu and at the same time it enables completion of the IO in a response queue that is affined to the cpu in the request path. Signed-off-by: Anirban Chakraborty <anirban.chakraborty@qlogic.com> Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
* [SCSI] qla2xxx: Add QoS support.Anirban Chakraborty2009-05-2011-368/+333
| | | | | | | | | | | | Set the number of request queues to the module paramater ql2xmaxqueues. Each vport gets a request queue. The QoS value set to the request queues determines priority control for queued IOs. If QoS value is not specified, the vports use the default queue 0. Signed-off-by: Anirban Chakraborty <anirban.chakraborty@qlogic.com> Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
* [SCSI] qla2xxx: Correct compilation failures when DEBUG'n' options are enabled.Andrew Vasquez2009-05-203-14/+17
| | | | | Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
* [SCSI] qla2xxx: Export additional FCoE attributes for application support.Andrew Vasquez2009-05-203-1/+50
| | | | | | | | Cull and export VN_Port MAC address and VLAN_ID information on supported FCoE ISPs. Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
* [SCSI] qla2xxx: Correct bus-reset behaviour with recent ISPs.Seokmann Ju2009-05-202-4/+2
| | | | | | | | | | | | | The short-circuit to skip the non-applicable 'full-login-lip' process on 81xx ISPs was nested too deeply in the 'bus-reset' routine, as the code in qla2x00_loop_reset() should skip the whole enable_lip_full_login process. The original code could cause device tear-down due to the qla2x00_wait_for_loop_ready() call taking a large amount of time. Signed-off-by: Seokmann Ju <seokmann.ju@qlogic.com> Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
* Merge branch 'scsi-fixes' into merge-baseJames Bottomley2009-05-2045-11/+9213
|\
| * [SCSI] mpt2sas: fix driver version inconsistencyEric Moore2009-05-181-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In Commit commit 3b8b5c9b1f08660583e5dfe095c24170df62f1d2 Author: Eric Moore <eric.moore@lsi.com> Date: Tue Apr 21 15:44:27 2009 -0600 [SCSI] mpt2sas : bump driver version to 01.100.02.00 The MPT2SAS_MAJOR_VERSION didn't get bumped from 00 to 01 so applications will see it incorrectly as 00.100.02.00 driver instead of 01.100.02.00. Fix by making MPT2SAS_MAJOR_VERSION match the major number in MPT2SAS_DRIVER_VERSION Signed-off-by: Eric Moore <eric.moore@lsi.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
| * [SCSI] 3w-xxxx: scsi_dma_unmap fixadam radford2009-05-152-3/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch fixes the following regression that occurred during the scsi_dma_map()/unmap() changes when compiling with CONFIG_DMA_API_DEBUG=y : WARNING: at lib/dma-debug.c:496 check_unmap+0x142/0x542() Hardware name: 3w-xxxx 0000:02:02.0: DMA-API: device driver tries to free DMA memory it has not allocated [device address=0x0000000000000000] [size=36 bytes] Signed-off-by: Adam Radford <aradford@gmail.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
| * [SCSI] 3w-9xxx: scsi_dma_unmap fixadam radford2009-05-151-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | This patch fixes the following regression the occurred during the scsi_dma_map()/unmap() changes: 3w-9xxx 0001:45:00.0: DMA-API: device driver tries to free DMA memory it has not allocated [device address=0x0000000000000000] [size=36 bytes] Signed-off-by: Adam Radford <aradford@gmail.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
| * [SCSI] ses: fix problems caused by empty SES provided nameYinghai Lu2009-05-151-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | We use the name provided by SES to name objects. An empty name is legal in SES but causes problems in our generic device hierarchy. Fix this by falling back to a number if the name is either NULL or empty. Also fix a secondary bug spotted in that dev_set_name(dev, name) uses a string format and so would go wrong if name contained a '%'. Signed-off-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
| * [SCSI] fc-transport: Close state transition-window during rport deletion.Andrew Vasquez2009-05-152-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Andrew Vasquez wrote: > fc-transport: Close state transition-window during rport deletion. > > After an rport's state has transitioned to FC_PORTSTATE_BLOCKED, > but, prior to making the upcall to 'block' the scsi-target > associated with an rport, queued commands can recycle and > ultimately run out of retries causing failures to propagate to > upper-level drivers. Close this transition-window by returning > the non-'retries' modifying DID_IMM_RETRY status for submitted > I/Os. The same can happen for iscsi when transitioning from logged in to failed and blocking the sdevs. This patch converts iscsi and fc's transitions back to use DID_IMM_RETRY instead of DID_TRANSPORT_DISRUPTED which has a limited number of retries that we do not want to use for handling this race. Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com> [Addition of iscsi and fc port online devloss case conversion by Mike Christie] Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
| * [SCSI] initialize max_target_blocked in scsi_alloc_targetEdward Goggin2009-05-141-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | This patch initializes the max_target_blocked field of a scsi target structure so that a queuecommand return value of SCSI_MLQUEUE_TARGET_BUSY will actually result in having the scsi_queue_insert blocking the device queue before requeuing the command and running the queue. Otherwise, can and does cause livelock on single CPU configurations if/when open-iSCSI software initiator's command PDU window fills. Signed-off-by: Ed Goggin <egoggin@vmware.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
| * [SCSI] fnic: Add new Cisco PCI-Express FCoE HBAAbhijeet Joglekar2009-05-1337-0/+9199
| | | | | | | | | | | | | | | | | | fnic is a driver for the Cisco PCI-Express FCoE HBA Signed-off-by: Abhijeet Joglekar <abjoglek@cisco.com> Signed-off-by: Joe Eykholt <jeykholt@cisco.com> Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
* | Merge branch 'fixes-for-linus' of git://git.monstr.eu/linux-2.6-microblazeLinus Torvalds2009-05-192-20/+34
|\ \ | | | | | | | | | | | | | | | * 'fixes-for-linus' of git://git.monstr.eu/linux-2.6-microblaze: microblaze: Fix kind-of-intr checking against number of interrupts microblaze: Update Microblaze defconfig
| * | microblaze: Fix kind-of-intr checking against number of interruptsMichal Simek2009-05-181-2/+2
| | | | | | | | | | | | | | | | | | + Fix typographic fault. Signed-off-by: Michal Simek <monstr@monstr.eu>
| * | microblaze: Update Microblaze defconfigMichal Simek2009-05-181-18/+32
| | | | | | | | | | | | Signed-off-by: Michal Simek <monstr@monstr.eu>
* | | Merge branch 'for-linus' of ↵Linus Torvalds2009-05-191-1/+1
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/lrg/voltage-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/lrg/voltage-2.6: regulator: da903x: add missing __devexit_p()
| * | | regulator: da903x: add missing __devexit_p()Mike Frysinger2009-05-181-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The remove function uses __devexit, so the .remove assignment needs __devexit_p() to fix a build error with hotplug disabled. Signed-off-by: Mike Frysinger <vapier@gentoo.org> CC: Liam Girdwood <lrg@slimlogic.co.uk> CC: Mike Rapoport <mike@compulab.co.il> CC: Eric Miao <eric.miao@marvell.com> Acked-by: Eric Miao <eric.y.miao@gmail.com> Signed-off-by: Liam Girdwood <lrg@slimlogic.co.uk>
* | | | Avoid ICE in get_random_int() with gcc-3.4.5Linus Torvalds2009-05-191-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Martin Knoblauch reports that trying to build 2.6.30-rc6-git3 with RHEL4.3 userspace (gcc (GCC) 3.4.5 20051201 (Red Hat 3.4.5-2)) causes an internal compiler error (ICE): drivers/char/random.c: In function `get_random_int': drivers/char/random.c:1672: error: unrecognizable insn: (insn 202 148 150 0 /scratch/build/linux-2.6.30-rc6-git3/arch/x86/include/asm/tsc.h:23 (set (reg:SI 0 ax [91]) (subreg:SI (plus:DI (plus:DI (reg:DI 0 ax [88]) (subreg:DI (reg:SI 6 bp) 0)) (const_int -4 [0xfffffffffffffffc])) 0)) -1 (nil) (nil)) drivers/char/random.c:1672: internal compiler error: in extract_insn, at recog.c:2083 and after some debugging it turns out that it's due to the code trying to figure out the rough value of the current stack pointer by taking an address of an uninitialized variable and casting that to an integer. This is clearly a compiler bug, but it's not worth fighting - while the current stack kernel pointer might be somewhat hard to predict in user space, it's also not generally going to change for a lot of the call chains for a particular process. So just drop it, and mumble some incoherent curses at the compiler. Tested-by: Martin Knoblauch <spamtrap@knobisoft.de> Cc: Matt Mackall <mpm@selenic.com> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | | nfs: Fix NFS v4 client handling of MAY_EXEC in nfs_permission.Frank Filz2009-05-181-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The problem is that permission checking is skipped if atomic open is possible, but when exec opens a file, it just opens it O_READONLY which means EXEC permission will not be checked at that time. This problem is observed by the following sequence (executed as root): mount -t nfs4 server:/ /mnt4 echo "ls" >/mnt4/foo chmod 744 /mnt4/foo su guest -c "mnt4/foo" Signed-off-by: Frank Filz <ffilzlnx@us.ibm.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: stable@kernel.org Tested-by: Eugene Teo <eugeneteo@kernel.sg> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | | Merge branch 'merge' of ↵Linus Torvalds2009-05-184-55/+76
|\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc * 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: powerpc: Explicit alignment for .data.cacheline_aligned powerpc/ps3: Update ps3_defconfig powerpc/ftrace: Fix constraint to be early clobber powerpc/ftrace: Use pr_devel() in ftrace.c powerpc: Do not assert pte_locked for hugepage PTE entries
| * | | | powerpc: Explicit alignment for .data.cacheline_alignedBenjamin Herrenschmidt2009-05-181-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | I don't think anything guarantees that the objects in data.page_aligned are a multiple of PAGE_SIZE, thus the section may end on any boundary. So the following section, .data.cacheline_aligned needs an explicit alignment. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
| * | | | powerpc/ps3: Update ps3_defconfigGeoff Levand2009-05-181-43/+62
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Refresh and set these options: CONFIG_SYSFS_DEPRECATED_V2: y -> n CONFIG_INPUT_JOYSTICK: y -> n CONFIG_HID_SONY: n -> m CONFIG_RTC_DRV_PS3: - -> m Signed-off-by: Geoff Levand <geoffrey.levand@am.sony.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
| * | | | powerpc/ftrace: Fix constraint to be early clobberSteven Rostedt2009-05-181-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | After upgrading my distcc boxes from gcc 4.2.2 to 4.4.0, the function graph tracer broke. This was discovered on my x86 boxes. The issue is that gcc used the same register for an output as it did for an input in an asm statement. I first thought this was a bug in gcc and reported it. I was notified that gcc was correct and that the output had to be flagged as an "early clobber". I noticed that powerpc had the same issue and this patch fixes it. Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
| * | | | powerpc/ftrace: Use pr_devel() in ftrace.cMichael Ellerman2009-05-181-10/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | pr_debug() can now result in code being generated even when #DEBUG is not defined. That's not really desirable in the ftrace code which we want to be snappy. With CONFIG_DYNAMIC_DEBUG=y: size before: text data bss dec hex filename 3334 672 4 4010 faa arch/powerpc/kernel/ftrace.o size after: text data bss dec hex filename 2616 360 4 2980 ba4 arch/powerpc/kernel/ftrace.o Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Acked-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
| * | | | powerpc: Do not assert pte_locked for hugepage PTE entriesMel Gorman2009-05-181-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | With CONFIG_DEBUG_VM, an assertion is made when changing the protection flags of a PTE that the PTE is locked. Huge pages use a different pagetable format and the assertion is bogus and will always trigger with a bug looking something like Unable to handle kernel paging request for data at address 0xf1a00235800006f8 Faulting instruction address: 0xc000000000034a80 Oops: Kernel access of bad area, sig: 11 [#1] SMP NR_CPUS=32 NUMA Maple Modules linked in: dm_snapshot dm_mirror dm_region_hash dm_log dm_mod loop evdev ext3 jbd mbcache sg sd_mod ide_pci_generic pata_amd ata_generic ipr libata tg3 libphy scsi_mod windfarm_pid windfarm_smu_sat windfarm_max6690_sensor windfarm_lm75_sensor windfarm_cpufreq_clamp windfarm_core i2c_powermac NIP: c000000000034a80 LR: c000000000034b18 CTR: 0000000000000003 REGS: c000000003037600 TRAP: 0300 Not tainted (2.6.30-rc3-autokern1) MSR: 9000000000009032 <EE,ME,IR,DR> CR: 28002484 XER: 200fffff DAR: f1a00235800006f8, DSISR: 0000000040010000 TASK = c0000002e54cc740[2960] 'map_high_trunca' THREAD: c000000003034000 CPU: 2 GPR00: 4000000000000000 c000000003037880 c000000000895d30 c0000002e5a2e500 GPR04: 00000000a0000000 c0000002edc40880 0000005700000393 0000000000000001 GPR08: f000000011ac0000 01a00235800006e8 00000000000000f5 f1a00235800006e8 GPR12: 0000000028000484 c0000000008dd780 0000000000001000 0000000000000000 GPR16: fffffffffffff000 0000000000000000 00000000a0000000 c000000003037a20 GPR20: c0000002e5f4ece8 0000000000001000 c0000002edc40880 0000000000000000 GPR24: c0000002e5f4ece8 0000000000000000 00000000a0000000 c0000002e5f4ece8 GPR28: 0000005700000393 c0000002e5a2e500 00000000a0000000 c000000003037880 NIP [c000000000034a80] .assert_pte_locked+0xa4/0xd0 LR [c000000000034b18] .ptep_set_access_flags+0x6c/0xb4 Call Trace: [c000000003037880] [c000000003037990] 0xc000000003037990 (unreliable) [c000000003037910] [c000000000034b18] .ptep_set_access_flags+0x6c/0xb4 [c0000000030379b0] [c00000000014bef8] .hugetlb_cow+0x124/0x674 [c000000003037b00] [c00000000014c930] .hugetlb_fault+0x4e8/0x6f8 [c000000003037c00] [c00000000013443c] .handle_mm_fault+0xac/0x828 [c000000003037cf0] [c0000000000340a8] .do_page_fault+0x39c/0x584 [c000000003037e30] [c0000000000057b0] handle_page_fault+0x20/0x5c Instruction dump: 7d29582a 7d200074 7800d182 0b000000 3c004000 3960ffff 780007c6 796b00c4 7d290214 7929a302 1d290068 7d6b4a14 <800b0010> 7c000074 7800d182 0b000000 This patch fixes the problem by not asseting the PTE is locked for VMAs backed by huge pages. Signed-off-by: Mel Gorman <mel@csn.ul.ie> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
| | | | |
| \ \ \ \
*-. \ \ \ \ Merge branches 'sched-fixes-for-linus-2' and 'core-fixes-for-linus-2' of ↵Linus Torvalds2009-05-182-3/+4
|\ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'sched-fixes-for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: sched: Fix fallback sched_clock()'s offset when using jiffies * 'core-fixes-for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: lockdep: increase MAX_LOCKDEP_ENTRIES and MAX_LOCKDEP_CHAINS
| | * | | | | lockdep: increase MAX_LOCKDEP_ENTRIES and MAX_LOCKDEP_CHAINSIngo Molnar2009-05-121-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Now that lockdep coverage has increased it has become easier to run out of entries: [ 21.401387] BUG: MAX_LOCKDEP_ENTRIES too low! [ 21.402007] turning off the locking correctness validator. [ 21.402007] Pid: 1555, comm: S99local Not tainted 2.6.30-rc5-tip #2 [ 21.402007] Call Trace: [ 21.402007] [<ffffffff81069789>] add_lock_to_list+0x53/0xba [ 21.402007] [<ffffffff810eb615>] ? lookup_mnt+0x19/0x53 [ 21.402007] [<ffffffff8106be14>] check_prev_add+0x14b/0x1c7 [ 21.402007] [<ffffffff8106c304>] validate_chain+0x474/0x52a [ 21.402007] [<ffffffff8106c6fc>] __lock_acquire+0x342/0x3c7 [ 21.402007] [<ffffffff8106c842>] lock_acquire+0xc1/0xe5 [ 21.402007] [<ffffffff810eb615>] ? lookup_mnt+0x19/0x53 [ 21.402007] [<ffffffff8153aedc>] _spin_lock+0x31/0x66 Double the size - as we've done in the past. [ Impact: allow lockdep to cover more locks ] Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | | | sched: Fix fallback sched_clock()'s offset when using jiffiesRon2009-05-091-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Account for the initial offset to the jiffy count. [ Impact: fix printk timestamps on architectures using fallback sched_clock() ] Signed-off-by: Ron Lee <ron@debian.org> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* | | | | | | Merge branch 'x86-fixes-for-linus' of ↵Linus Torvalds2009-05-1813-24/+58
|\ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86: Fix performance regression caused by paravirt_ops on native kernels xen: use header for EXPORT_SYMBOL_GPL x86, 32-bit: fix kernel_trap_sp() x86: fix percpu_{to,from}_op() x86: mtrr: Fix high_width computation when phys-addr is >= 44bit x86: Fix false positive section mismatch warnings in the apic code
| * | | | | | | x86: Fix performance regression caused by paravirt_ops on native kernelsJeremy Fitzhardinge2009-05-157-10/+38
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Xiaohui Xin and some other folks at Intel have been looking into what's behind the performance hit of paravirt_ops when running native. It appears that the hit is entirely due to the paravirtualized spinlocks introduced by: | commit 8efcbab674de2bee45a2e4cdf97de16b8e609ac8 | Date: Mon Jul 7 12:07:51 2008 -0700 | | paravirt: introduce a "lock-byte" spinlock implementation The extra call/return in the spinlock path is somehow causing an increase in the cycles/instruction of somewhere around 2-7% (seems to vary quite a lot from test to test). The working theory is that the CPU's pipeline is getting upset about the call->call->locked-op->return->return, and seems to be failing to speculate (though I haven't seen anything definitive about the precise reasons). This doesn't entirely make sense, because the performance hit is also visible on unlock and other operations which don't involve locked instructions. But spinlock operations clearly swamp all the other pvops operations, even though I can't imagine that they're nearly as common (there's only a .05% increase in instructions executed). If I disable just the pv-spinlock calls, my tests show that pvops is identical to non-pvops performance on native (my measurements show that it is actually about .1% faster, but Xiaohui shows a .05% slowdown). Summary of results, averaging 10 runs of the "mmperf" test, using a no-pvops build as baseline: nopv Pv-nospin Pv-spin CPU cycles 100.00% 99.89% 102.18% instructions 100.00% 100.10% 100.15% CPI 100.00% 99.79% 102.03% cache ref 100.00% 100.84% 100.28% cache miss 100.00% 90.47% 88.56% cache miss rate 100.00% 89.72% 88.31% branches 100.00% 99.93% 100.04% branch miss 100.00% 103.66% 107.72% branch miss rt 100.00% 103.73% 107.67% wallclock 100.00% 99.90% 102.20% The clear effect here is that the 2% increase in CPI is directly reflected in the final wallclock time. (The other interesting effect is that the more ops are out of line calls via pvops, the lower the cache access and miss rates. Not too surprising, but it suggests that the non-pvops kernel is over-inlined. On the flipside, the branch misses go up correspondingly...) So, what's the fix? Paravirt patching turns all the pvops calls into direct calls, so _spin_lock etc do end up having direct calls. For example, the compiler generated code for paravirtualized _spin_lock is: <_spin_lock+0>: mov %gs:0xb4c8,%rax <_spin_lock+9>: incl 0xffffffffffffe044(%rax) <_spin_lock+15>: callq *0xffffffff805a5b30 <_spin_lock+22>: retq The indirect call will get patched to: <_spin_lock+0>: mov %gs:0xb4c8,%rax <_spin_lock+9>: incl 0xffffffffffffe044(%rax) <_spin_lock+15>: callq <__ticket_spin_lock> <_spin_lock+20>: nop; nop /* or whatever 2-byte nop */ <_spin_lock+22>: retq One possibility is to inline _spin_lock, etc, when building an optimised kernel (ie, when there's no spinlock/preempt instrumentation/debugging enabled). That will remove the outer call/return pair, returning the instruction stream to a single call/return, which will presumably execute the same as the non-pvops case. The downsides arel 1) it will replicate the preempt_disable/enable code at eack lock/unlock callsite; this code is fairly small, but not nothing; and 2) the spinlock definitions are already a very heavily tangled mass of #ifdefs and other preprocessor magic, and making any changes will be non-trivial. The other obvious answer is to disable pv-spinlocks. Making them a separate config option is fairly easy, and it would be trivial to enable them only when Xen is enabled (as the only non-default user). But it doesn't really address the common case of a distro build which is going to have Xen support enabled, and leaves the open question of whether the native performance cost of pv-spinlocks is worth the performance improvement on a loaded Xen system (10% saving of overall system CPU when guests block rather than spin). Still it is a reasonable short-term workaround. [ Impact: fix pvops performance regression when running native ] Analysed-by: "Xin Xiaohui" <xiaohui.xin@intel.com> Analysed-by: "Li Xin" <xin.li@intel.com> Analysed-by: "Nakajima Jun" <jun.nakajima@intel.com> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Acked-by: H. Peter Anvin <hpa@zytor.com> Cc: Nick Piggin <npiggin@suse.de> Cc: Xen-devel <xen-devel@lists.xensource.com> LKML-Reference: <4A0B62F7.5030802@goop.org> [ fixed the help text ] Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | | | | xen: use header for EXPORT_SYMBOL_GPLRandy Dunlap2009-05-131-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | mmu.c needs to #include module.h to prevent these warnings: arch/x86/xen/mmu.c:239: warning: data definition has no type or storage class arch/x86/xen/mmu.c:239: warning: type defaults to 'int' in declaration of 'EXPORT_SYMBOL_GPL' arch/x86/xen/mmu.c:239: warning: parameter names (without types) in function declaration [ Impact: cleanup ] Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Acked-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | | | | x86, 32-bit: fix kernel_trap_sp()Masami Hiramatsu2009-05-122-4/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Use &regs->sp instead of regs for getting the top of stack in kernel mode. (on x86-64, regs->sp always points the top of stack) [ Impact: Oprofile decodes only stack for backtracing on i386 ] Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com> [ v2: rename the API to kernel_stack_pointer(), move variable inside ] Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: systemtap@sources.redhat.com Cc: Harvey Harrison <harvey.harrison@gmail.com> Cc: Jan Blunck <jblunck@suse.de> Cc: Christoph Hellwig <hch@infradead.org> LKML-Reference: <20090511210300.17332.67549.stgit@localhost.localdomain> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | | | | x86: fix percpu_{to,from}_op()Jan Beulich2009-05-111-5/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | - the byte operand constraints were wrong for 32-bit - the to-op's input operands weren't properly parenthesized [ Impact: fix possible miscompilation or build failure ] Signed-off-by: Jan Beulich <jbeulich@novell.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
| * | | | | | | x86: mtrr: Fix high_width computation when phys-addr is >= 44bitYinghai Lu2009-05-111-1/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | found one system where cpu address line is 44bits, mtrr printout is not right: [ 0.000000] MTRR variable ranges enabled: [ 0.000000] 0 base 0 00000000 mask FF0 00000000 write-back [ 0.000000] 1 base 10 00000000 mask FFF 80000000 write-back [ 0.000000] 2 base 0 80000000 mask FFF 80000000 uncachable [ 0.000000] 3 base 0 7F800000 mask FFF FF800000 uncachable Li Zefan and Frederic pointed out the high_width could be -4 some how. It turns out when phys_addr is 44bit, size_or_mask will be ffffffff,00000000 so ffs(size_or_mask) will be 0. Try to check low 32 bit, to get correct high_width. Signed-off-by: Yinghai Lu <yinghai@kerne.org> Also-analyzed-by: Frederic Weisbecker <fweisbec@gmail.com> Also-analyzed-by: Li Zefan <lizf@cn.fujitsu.com> Cc: Jeremy Fitzhardinge <jeremy@goop.org> Cc: Zhaolei <zhaolei@cn.fujitsu.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Vegard Nossum <vegard.nossum@gmail.com> Cc: Andrew Morton <akpm@linux-foundation.org> LKML-Reference: <4A026540.8060504@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | | | | x86: Fix false positive section mismatch warnings in the apic codeSam Ravnborg2009-05-101-4/+4
| |/ / / / / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | [ Impact: reduce kernel image size a bit, annotate away warnings ] Signed-off-by: Sam Ravnborg <sam@ravnborg.org> [ modified and tested it ] Signed-off-by: Rakib Mullick <rakib.mullick@gmail.com> Cc: Marcin Slusarz <marcin.slusarz@gmail.com> LKML-Reference: <b9df5fa10905090235s4bfd26a8o979f93809c9727ad@mail.gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* | | | | | | Merge branch 'tracing-fixes-for-linus' of ↵Linus Torvalds2009-05-182-2/+2
|\ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'tracing-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: tracing: Append prompt in /debug/tracing/README file x86/function-graph: fix constraint for recording old return value