summaryrefslogtreecommitdiffstats
path: root/arch
Commit message (Collapse)AuthorAgeFilesLines
* Merge branch 'x86-apic-for-linus' of ↵Linus Torvalds2009-12-0525-415/+796
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (30 commits) x86, apic: Enable lapic nmi watchdog on AMD Family 11h x86: Remove unnecessary mdelay() from cpu_disable_common() x86, ioapic: Document another case when level irq is seen as an edge x86, ioapic: Fix the EOI register detection mechanism x86, io-apic: Move the effort of clearing remoteIRR explicitly before migrating the irq x86: SGI UV: Map low MMR ranges x86: apic: Print out SRAT table APIC id in hex x86: Re-get cfg_new in case reuse/move irq_desc x86: apic: Remove not needed #ifdef x86: io-apic: IO-APIC MMIO should not fail on resource insertion x86: Remove asm/apicnum.h x86: apic: Do not use stacked physid_mask_t x86, apic: Get rid of apicid_to_cpu_present assign on 64-bit x86, ioapic: Use snrpintf while set names for IO-APIC resourses x86, apic: Use PAGE_SIZE instead of numbers x86: Remove local_irq_enable()/local_irq_disable() in fixup_irqs() x86: Use EOI register in io-apic on intel platforms x86: Force irq complete move during cpu offline x86: Remove move_cleanup_count from irq_cfg x86, intr-remap: Avoid irq_chip mask/unmask in fixup_irqs() for intr-remapping ...
| * x86, apic: Enable lapic nmi watchdog on AMD Family 11hMikael Pettersson2009-12-031-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The x86 lapic nmi watchdog does not recognize AMD Family 11h, resulting in: NMI watchdog: CPU not supported As far as I can see from available documentation (the BKDM), family 11h looks identical to family 10h as far as the PMU is concerned. Extending the check to accept family 11h results in: Testing NMI watchdog ... OK. I've been running with this change on a Turion X2 Ultra ZM-82 laptop for a couple of weeks now without problems. Signed-off-by: Mikael Pettersson <mikpe@it.uu.se> Cc: Andreas Herrmann <andreas.herrmann3@amd.com> Cc: Joerg Roedel <joerg.roedel@amd.com> Cc: <stable@kernel.org> LKML-Reference: <19223.53436.931768.278021@pilspetsen.it.uu.se> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * x86: Remove unnecessary mdelay() from cpu_disable_common()Suresh Siddha2009-12-021-9/+0
| | | | | | | | | | | | | | | | | | | | | | | | fixup_irqs() already has a mdelay(). Remove the extra and unnecessary mdelay() from cpu_disable_common(). Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Cc: Maciej W. Rozycki <macro@linux-mips.org> Cc: ebiederm@xmission.com Cc: garyhade@us.ibm.com LKML-Reference: <20091201233335.232177348@sbs-t61.sc.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * x86, ioapic: Document another case when level irq is seen as an edgeSuresh Siddha2009-12-021-1/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In the case when cpu goes offline, fixup_irqs() will forward any unhandled interrupt on the offlined cpu to the new cpu destination that is handling the corresponding interrupt. This interrupt forwarding is done via IPI's. Hence, in this case also level-triggered io-apic interrupt will be seen as an edge interrupt in the cpu's APIC IRR. Document this scenario in the code which handles this case by doing an explicit EOI to the io-apic to clear remote IRR of the io-apic RTE. Requested-by: Maciej W. Rozycki <macro@linux-mips.org> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Cc: Maciej W. Rozycki <macro@linux-mips.org> Cc: ebiederm@xmission.com Cc: garyhade@us.ibm.com LKML-Reference: <20091201233335.143970505@sbs-t61.sc.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * x86, ioapic: Fix the EOI register detection mechanismSuresh Siddha2009-12-021-54/+61
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Maciej W. Rozycki reported: > 82093AA I/O APIC has its version set to 0x11 and it > does not support the EOI register. Similarly I/O APICs > integrated into the 82379AB south bridge and the 82374EB/SB > EISA component. IO-APIC versions below 0x20 don't support EOI register. Some of the Intel ICH Specs (ICH2 to ICH5) documents the io-apic version as 0x2. This is an error with documentation and these ICH chips use io-apic's of version 0x20 and indeed has a working EOI register for the io-apic. Fix the EOI register detection mechanism to check for version 0x20 and beyond. And also, a platform can potentially have io-apic's with different versions. Make the EOI register check per io-apic. Reported-by: Maciej W. Rozycki <macro@linux-mips.org> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Cc: ebiederm@xmission.com Cc: garyhade@us.ibm.com LKML-Reference: <20091201233335.065361533@sbs-t61.sc.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * x86, io-apic: Move the effort of clearing remoteIRR explicitly before ↵Maciej W. Rozycki2009-12-021-14/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | migrating the irq When the level-triggered interrupt is seen as an edge interrupt, we try to clear the remoteIRR explicitly (using either an io-apic eoi register when present or through the idea of changing trigger mode of the io-apic RTE to edge and then back to level). But this explicit try also needs to happen before we try to migrate the irq. Otherwise irq migration attempt will fail anyhow, as it postpones the irq migration to a later attempt when it sees the remoteIRR in the io-apic RTE still set. Signed-off-by: "Maciej W. Rozycki" <macro@linux-mips.org> Reviewed-by: Suresh Siddha <suresh.b.siddha@intel.com> Cc: ebiederm@xmission.com Cc: garyhade@us.ibm.com LKML-Reference: <20091201233334.975416130@sbs-t61.sc.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * x86: SGI UV: Map low MMR rangesJack Steiner2009-11-261-0/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Explicitly mmap the UV chipset MMR address ranges used to access blade-local registers. Although these same MMRs are also mmaped at higher addresses, the low range is more convenient when accessing blade-local registers. The low range addresses always alias to the local blade regardless of the blade id. Signed-off-by: Jack Steiner <steiner@sgi.com> LKML-Reference: <20091125162018.GA25445@sgi.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * x86: apic: Print out SRAT table APIC id in hexYinghai Lu2009-11-231-2/+2
| | | | | | | | | | | | | | | | | | Make it consistent with APIC MADT print out, for big systems APIC id in hex is more readable. Signed-off-by: Yinghai Lu <yinghai@kernel.org> LKML-Reference: <4B07A739.3030104@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * x86: Re-get cfg_new in case reuse/move irq_descYinghai Lu2009-11-231-0/+1
| | | | | | | | | | | | | | | | When irq_desc is moved, we need to make sure to use the right cfg_new. Signed-off-by: Yinghai Lu <yinghai@kernel.org> LKML-Reference: <4B07A739.3030104@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * x86: apic: Remove not needed #ifdefYinghai Lu2009-11-231-6/+3
| | | | | | | | | | | | | | | | Suresh made dmar_table_init() already have that protection. Signed-off-by: Yinghai Lu <yinghai@kernel.org> LKML-Reference: <4B07A739.3030104@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * x86: io-apic: IO-APIC MMIO should not fail on resource insertionCyrill Gorcunov2009-11-162-6/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If IO-APIC base address is 1K aligned we should not fail on resourse insertion procedure. For this sake we define IO_APIC_SLOT_SIZE constant which should cover all IO-APIC direct accessible registers. An example of a such configuration is there http://marc.info/?l=linux-kernel&m=118114792006520 | | Quoting the message | | IOAPIC[0]: apic_id 2, version 32, address 0xfec00000, GSI 0-23 | IOAPIC[1]: apic_id 3, version 32, address 0xfec80000, GSI 24-47 | IOAPIC[2]: apic_id 4, version 32, address 0xfec80400, GSI 48-71 | IOAPIC[3]: apic_id 5, version 32, address 0xfec84000, GSI 72-95 | IOAPIC[4]: apic_id 8, version 32, address 0xfec84400, GSI 96-119 | Reported-by: "Maciej W. Rozycki" <macro@linux-mips.org> Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> Acked-by: Yinghai Lu <yinghai@kernel.org> LKML-Reference: <20091116151426.GC5653@lenovo> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * x86: Remove asm/apicnum.hDimitri Sivanich2009-11-101-12/+0
| | | | | | | | | | | | | | | | | | | | | | arch/x86/include/asm/apicnum.h is not referenced anywhere anymore. Its definitions appear in apicdef.h. Remove it. Signed-off-by: Dimitri Sivanich <sivanich@sgi.com> Acked-by: Cyrill Gorcunov <gorcunov@openvz.org> Acked-by: Mike Travis <travis@sgi.com> LKML-Reference: <20091110195835.GA4393@sgi.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * x86: apic: Do not use stacked physid_mask_tCyrill Gorcunov2009-11-1010-73/+50
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We should not use physid_mask_t as a stack based variable in apic code. This type depends on MAX_APICS parameter which may be huge enough. Especially it became a problem with apic NOOP driver which is portable between 32 bit and 64 bit environment (where we have really huge MAX_APICS). So apic driver should operate with pointers and a caller in turn should aware of allocation physid_mask_t variable. As a side (but positive) effect -- we may use already implemented physid_set_mask_of_physid function eliminating default_apicid_to_cpu_present completely. Note that physids_coerce and physids_promote turned into static inline from macro (since macro hides the fact that parameter is being interpreted as unsigned long, make it explicit). Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> Cc: Yinghai Lu <yinghai@kernel.org> Cc: Maciej W. Rozycki <macro@linux-mips.org> Cc: Stephen Rothwell <sfr@canb.auug.org.au> LKML-Reference: <20091109220659.GA5568@lenovo> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * x86, apic: Get rid of apicid_to_cpu_present assign on 64-bitCyrill Gorcunov2009-11-081-0/+5
| | | | | | | | | | | | | | | | | | | | | | In fact it's never get used on x86-64 (for 64 bit platform we use differ technique to enumerate io-units). Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> Cc: Peter Zijlstra <peterz@infradead.org> LKML-Reference: <20091108131645.GD5300@lenovo> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * x86, ioapic: Use snrpintf while set names for IO-APIC resoursesCyrill Gorcunov2009-11-081-1/+1
| | | | | | | | | | | | | | | | | | | | | | We should be ready that one day MAX_IO_APICS may raise its number. To prevent memory overwrite we're to use safe snprintf while set IO-APIC resourse name. Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> Cc: Yinghai Lu <yinghai@kernel.org> LKML-Reference: <20091108155431.GC25940@lenovo> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * x86, apic: Use PAGE_SIZE instead of numbersCyrill Gorcunov2009-11-081-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | The whole page is reserved for IO-APIC fixmap due to non-cacheable requirement. So lets note this explicitly instead of playing with numbers. Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> Cc: Yinghai Lu <yinghai@kernel.org> Cc: Maciej W. Rozycki <macro@linux-mips.org> LKML-Reference: <20091108155356.GB25940@lenovo> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * x86: Remove local_irq_enable()/local_irq_disable() in fixup_irqs()Suresh Siddha2009-11-021-4/+28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | To ensure that we handle all the pending interrupts (destined for this cpu that is going down) in the interrupt subsystem before the cpu goes offline, fixup_irqs() does: local_irq_enable(); mdelay(1); local_irq_disable(); Enabling interrupts is not a good thing as this cpu is already offline. So this patch replaces that logic with, mdelay(1); check APIC_IRR bits Retrigger the irq at the new destination if any interrupt has arrived via IPI. For IO-APIC level triggered interrupts, this retrigger IPI will appear as an edge interrupt. ack_apic_level() will detect this condition and IO-APIC RTE's remoteIRR is cleared using directed EOI(using IO-APIC EOI register) on Intel platforms and for others it uses the existing mask+edge logic followed by unmask+level. We can also remove mdelay() and then send spuriuous interrupts to new cpu targets for all the irqs that were handled previously by this cpu that is going offline. While it works, I have seen spurious interrupt messages (nothing wrong but still annoying messages during cpu offline, which can be seen during suspend/resume etc) Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Acked-by: Gary Hade <garyhade@us.ibm.com> Cc: Eric W. Biederman <ebiederm@xmission.com> LKML-Reference: <20091026230002.043281924@sbs-t61.sc.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * x86: Use EOI register in io-apic on intel platformsSuresh Siddha2009-11-021-27/+54
| | | | | | | | | | | | | | | | | | | | | | | | | | IO-APIC's in intel chipsets support EOI register starting from IO-APIC version 2. Use that when ever we need to clear the IO-APIC RTE's RemoteIRR bit explicitly. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Acked-by: Gary Hade <garyhade@us.ibm.com> Cc: Eric W. Biederman <ebiederm@xmission.com> LKML-Reference: <20091026230001.947855317@sbs-t61.sc.intel.com> [ Marked use_eio_reg as __read_mostly, fixed small details ] Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * x86: Force irq complete move during cpu offlineSuresh Siddha2009-11-023-3/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When a cpu goes offline, fixup_irqs() try to move irq's currently destined to the offline cpu to a new cpu. But this attempt will fail if the irq is recently moved to this cpu and the irq still hasn't arrived at this cpu (for non intr-remapping platforms this is when we free the vector allocation at the previous destination) that is about to go offline. This will endup with the interrupt subsystem still pointing the irq to the offline cpu, causing that irq to not work any more. Fix this by forcing the irq to complete its move (its been a long time we moved the irq to this cpu which we are offlining now) and then move this irq to a new cpu before this cpu goes offline. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Acked-by: Gary Hade <garyhade@us.ibm.com> Cc: Eric W. Biederman <ebiederm@xmission.com> LKML-Reference: <20091026230001.848830905@sbs-t61.sc.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * x86: Remove move_cleanup_count from irq_cfgSuresh Siddha2009-11-022-9/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | move_cleanup_count for each irq in irq_cfg is keeping track of the total number of cpus that need to free the corresponding vectors associated with the irq which has now been migrated to new destination. As long as this move_cleanup_count is non-zero (i.e., as long as we have n't freed the vector allocations on the old destinations) we were preventing the irq's further migration. This cleanup count is unnecessary and it is enough to not allow the irq migration till we send the cleanup vector to the previous irq destination, for which we already have irq_cfg's move_in_progress. All we need to make sure is that we free the vector at the old desintation but we don't need to wait till that gets freed. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Acked-by: Gary Hade <garyhade@us.ibm.com> Cc: Eric W. Biederman <ebiederm@xmission.com> LKML-Reference: <20091026230001.752968906@sbs-t61.sc.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * x86, intr-remap: Avoid irq_chip mask/unmask in fixup_irqs() for intr-remappingSuresh Siddha2009-11-021-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In the presence of interrupt-remapping, irqs will be migrated in the process context and we don't do (and there is no need to) irq_chip mask/unmask while migrating the interrupt. Similarly fix the fixup_irqs() that get called during cpu offline and avoid calling irq_chip mask/unmask for irqs that are ok to be migrated in the process context. While we didn't observe any race condition with the existing code, this change takes complete advantage of interrupt-remapping in the newer generation platforms and avoids any potential HW lockup's (that often worry Eric :) Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Acked-by: Eric W. Biederman <ebiederm@xmission.com> Cc: garyhade@us.ibm.com LKML-Reference: <20091026230001.661423939@sbs-t61.sc.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * x86: Unify fixup_irqs() for 32-bit and 64-bit kernelsSuresh Siddha2009-11-023-103/+59
| | | | | | | | | | | | | | | | | | | | | | | | There is no reason to have different fixup_irqs() for 32-bit and 64-bit kernels. Unify by using the superior 64-bit version for both the kernels. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: Gary Hade <garyhade@us.ibm.com> Cc: Eric W. Biederman <ebiederm@xmission.com> LKML-Reference: <20091026230001.562512739@sbs-t61.sc.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * x86, apic: Clear APIC Timer Initial Count Register on shutdownAndreas Herrmann2009-10-271-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit a98f8fd24fb24fcb9a359553e64dd6aac5cf4279 (x86: apic reset counter on shutdown) set the counter to max to avoid spurious interrupts when the timer is re-enabled. (In theory) you'll still get a spurious interrupt if spending more than 344 seconds with this interrupt disabled and then unmasking it. The right thing to do is to clear the register. This disables the interrupt from happening (at least it does on AMD hardware). Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com> LKML-Reference: <20091027100138.GB30802@alberich.amd.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * x86: apic: Allow noop operations to be called almost at any timeCyrill Gorcunov2009-10-152-47/+59
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As only apic noop is used we allow to use almost any operation caller wants (and which of them noop driver supports of course). Initially it was reported by Ingo Molnar that apic noop issue a warning for pkg id (which is actually false positive and should be eliminated). So we save checking (and warning issue) for read/write operations while allow any other ops to be freely used. Also: - fix noop_cpu_to_logical_apicid, it should be 0. - rename noop_default_phys_pkg_id to noop_phys_pkg_id (we use default_ prefix for more general routines in apic subsystem). Reported-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> Cc: Yinghai Lu <yinghai@kernel.org> Cc: Maciej W. Rozycki <macro@linux-mips.org> LKML-Reference: <20091015150416.GC5331@lenovo> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * x86, apic: Fix prototype in hw_irq.hIngo Molnar2009-10-141-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This warning: In file included from arch/x86/include/asm/ipi.h:23, from arch/x86/kernel/apic/apic_noop.c:27: arch/x86/include/asm/hw_irq.h:105: warning: ‘struct irq_desc’ declared inside parameter list arch/x86/include/asm/hw_irq.h:105: warning: its scope is only this definition or declaration, which is probably not what you want triggers because irq_desc is defined after hw_irq.h is included in irq.h. Since it's pointer reference only, a forward declaration of the type will solve the problem. LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * x86, apic: Move SGI UV functionality out of generic IO-APIC codeDimitri Sivanich2009-10-144-154/+145
| | | | | | | | | | | | | | | | | | Move UV specific functionality out of the generic IO-APIC code. Signed-off-by: Dimitri Sivanich <sivanich@sgi.com> LKML-Reference: <20091013203236.GD20543@sgi.com> [ Cleaned up the code some more in their new places. ] Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * x86: SGI UV: Fix irq affinity for hub based interruptsDimitri Sivanich2009-10-143-15/+177
| | | | | | | | | | | | | | | | | | | | This patch fixes handling of uv hub irq affinity. IRQs with ALL or NODE affinity can be routed to cpus other than their originally assigned cpu. Those with CPU affinity cannot be rerouted. Signed-off-by: Dimitri Sivanich <sivanich@sgi.com> LKML-Reference: <20090930160259.GA7822@sgi.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * x86, apic: Limit apic dumping, introduce new show_lapic= setup optionCyrill Gorcunov2009-10-141-15/+32
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In case if a system has a large number of cpus printing apics contents may consume a long time period. We limit such an output by 1 apic by default. But to have an ability to see all apics or some part of them we introduce "show_lapic" setup option which allow us to limit/unlimit the number of APICs being dumped. Example: apic=debug show_lapic=5, or apic=debug show_lapic=all Also move apic_verbosity checking upper that way so helper routines do not need to inspect it at all. Suggested-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> Cc: yinghai@kernel.org Cc: macro@linux-mips.org LKML-Reference: <20091013201022.926793122@openvz.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * x86, apic: Use apic noop driverCyrill Gorcunov2009-10-141-19/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | In case if apic were disabled we may use the whole apic NOOP driver instead of sparse poking the some functions in apic driver. Also NOOP would catch any inappropriate apic operation calls (not just read/write). Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> Cc: yinghai@kernel.org Cc: macro@linux-mips.org LKML-Reference: <20091013201022.747817361@openvz.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * x86, apic: Introduce the NOOP apic driverCyrill Gorcunov2009-10-143-1/+197
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Introduce NOOP APIC driver. We should use it in case if apic was disabled due to hardware of software/firmware problems (including user requested to disable it case). The driver is attempting to catch any inappropriate apic operation call with warning issue. Also it is possible to use some apic operation like IPI calls, read/write without checking for apic presence which should make callers code easier. Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> Cc: yinghai@kernel.org Cc: macro@linux-mips.org LKML-Reference: <20091013201022.534682104@openvz.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* | Merge branch 'perf-core-for-linus' of ↵Linus Torvalds2009-12-0574-521/+4559
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (470 commits) x86: Fix comments of register/stack access functions perf tools: Replace %m with %a in sscanf hw-breakpoints: Keep track of user disabled breakpoints tracing/syscalls: Make syscall events print callbacks static tracing: Add DEFINE_EVENT(), DEFINE_SINGLE_EVENT() support to docbook perf: Don't free perf_mmap_data until work has been done perf_event: Fix compile error perf tools: Fix _GNU_SOURCE macro related strndup() build error trace_syscalls: Remove unused syscall_name_to_nr() trace_syscalls: Simplify syscall profile trace_syscalls: Remove duplicate init_enter_##sname() trace_syscalls: Add syscall_nr field to struct syscall_metadata trace_syscalls: Remove enter_id exit_id trace_syscalls: Set event_enter_##sname->data to its metadata trace_syscalls: Remove unused event_syscall_enter and event_syscall_exit perf_event: Initialize data.period in perf_swevent_hrtimer() perf probe: Simplify event naming perf probe: Add --list option for listing current probe events perf probe: Add argv_split() from lib/argv_split.c perf probe: Move probe event utility functions to probe-event.c ...
| * \ Merge branch 'perf/probes' into perf/coreIngo Molnar2009-12-032-3/+35
| |\ \ | | | | | | | | | | | | | | | | | | | | Merge reason: add these fixes to 'perf probe'. Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | * | x86: insn decoder test checks objdump versionMasami Hiramatsu2009-11-202-1/+27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Check objdump version before using it for insn decoder build test, because some older objdump can't decode AVX code correctly. Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Randy Dunlap <rdunlap@xenotime.net> Cc: Jim Keniston <jkenisto@us.ibm.com> LKML-Reference: <20091120171314.6715.30390.stgit@dhcp-100-2-132.bos.redhat.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
| | * | x86: Fix insn decoder test typosMasami Hiramatsu2009-11-201-3/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix postest_verbose to posttest_verbose, and add posttest_64bit option for CONFIG_64BIT != y, since old command just passed '-' instead of '-n' when CONFIG_64BIT is not set. Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Randy Dunlap <rdunlap@xenotime.net> Cc: Jim Keniston <jkenisto@us.ibm.com> LKML-Reference: <20091120171307.6715.66099.stgit@dhcp-100-2-132.bos.redhat.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
| * | | Merge branch 'perf/mce' into perf/coreIngo Molnar2009-12-037-54/+117
| |\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | Merge reason: It's ready for v2.6.33. Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | * | | x86, mce: Add __cpuinit to hotplug callback functionsHidetoshi Seto2009-11-261-2/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The mce_disable_cpu() and mce_reenable_cpu() are called only from mce_cpu_callback() which is marked as __cpuinit. So these functions can be __cpuinit too. Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Cc: Andi Kleen <ak@linux.intel.com> LKML-Reference: <4B0E3C4E.4090809@jp.fujitsu.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | * | | x86: Tighten conditionals on MCE related statisticsJan Beulich2009-11-232-9/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | irq_thermal_count is only being maintained when X86_THERMAL_VECTOR, and both X86_THERMAL_VECTOR and X86_MCE_THRESHOLD don't need extra wrapping in X86_MCE conditionals. Signed-off-by: Jan Beulich <jbeulich@novell.com> Cc: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Cc: Yong Wang <yong.y.wang@intel.com> Cc: Suresh Siddha <suresh.b.siddha@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Borislav Petkov <borislav.petkov@amd.com> Cc: Arjan van de Ven <arjan@infradead.org> LKML-Reference: <4B06AFA902000078000211F8@vpn.id2.novell.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | * | | x86, mce: Fix __init annotationsHidetoshi Seto2009-11-122-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The intel_init_thermal() is called from resume path, so it cannot be marked as __init. OTOH mce_banks_init() is only called from __mcheck_cpu_cap_init() which is marked as __cpuinit, so it can be also marked as __cpuinit. Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Acked-by: Yong Wang <yong.y.wang@linux.intel.com> LKML-Reference: <4AFBB0B8.2070501@jp.fujitsu.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | * | | x86: Mark the thermal init functions __initYong Wang2009-11-111-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Mark the thermal init functions __init so that the init memory can be freed. Signed-off-by: Yong Wang <yong.y.wang@intel.com> LKML-Reference: <20091111075125.GA17900@ywang-moblin2.bj.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | * | | x86: Under BIOS control, restore AP's APIC_LVTTHMR to the BSP valueYong Wang2009-11-105-5/+43
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | On platforms where the BIOS handles the thermal monitor interrupt, APIC_LVTTHMR on each logical CPU is programmed to generate a SMI and OS must not touch it. Unfortunately AP bringup sequence using INIT-SIPI-SIPI clears all the LVT entries except the mask bit. Essentially this results in all LVT entries including the thermal monitoring interrupt set to masked (clearing the bios programmed value for APIC_LVTTHMR). And this leads to kernel take over the thermal monitoring interrupt on AP's but not on BSP (leaving the bios programmed value only on BSP). As a result of this, we have seen system hangs when the thermal monitoring interrupt is generated. Fix this by reading the initial value of thermal LVT entry on BSP and if bios has taken over the control, then program the same value on all AP's and leave the thermal monitoring interrupt control on all the logical cpu's to the bios. Signed-off-by: Yong Wang <yong.y.wang@intel.com> Reviewed-by: Suresh Siddha <suresh.b.siddha@intel.com> Cc: Borislav Petkov <borislav.petkov@amd.com> Cc: Arjan van de Ven <arjan@infradead.org> LKML-Reference: <20091110013824.GA24940@ywang-moblin2.bj.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Cc: stable@kernel.org
| | * | | x86, mce: Add a global MCE init helperBorislav Petkov2009-10-161-2/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add an early initcall (pre SMP) which sets up global MCE functionality. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> Cc: Andi Kleen <andi@firstfloor.org> LKML-Reference: <1255689093-26921-2-git-send-email-borislav.petkov@amd.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | * | | x86, mce: Fix up MCE naming nomenclatureBorislav Petkov2009-10-163-29/+29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Prefix global/setup routines with "mcheck_" thus differentiating from the internal facilities prefixed with "mce_". Also, prefix the per cpu calls with mcheck_cpu and rename them to reflect the MCE setup hierarchy of calls better. There should be no functionality change resulting from this patch. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> Cc: Andi Kleen <andi@firstfloor.org> LKML-Reference: <1255689093-26921-1-git-send-email-borislav.petkov@amd.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | | | |
| | | \ \
| | *-. \ \ Merge branches 'x86/mce' and 'x86/urgent' into perf/mceIngo Molnar2009-10-1670-365/+633
| | |\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Merge reason: Put all MCE changes into this branch, we are queueing up a dependent patch. Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | | * | | | mce, edac: Use an atomic notifier for MCEs decodingBorislav Petkov2009-10-122-10/+22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add an atomic notifier which ensures proper locking when conveying MCE info to EDAC for decoding. The actual notifier call overrides a default, negative priority notifier. Note: make sure we register the default decoder only once since mcheck_init() runs on each CPU. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> LKML-Reference: <20091003065752.GA8935@liondog.tnic> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | * | | | | perf_event, x86, mce: Use TRACE_EVENT() for MCE loggingHidetoshi Seto2009-10-131-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This approach is the first baby step towards solving many of the structural problems the x86 MCE logging code is having today: - It has a private ring-buffer implementation that has a number of limitations and has been historically fragile and buggy. - It is using a quirky /dev/mcelog ioctl driven ABI that is MCE specific. /dev/mcelog is not part of any larger logging framework and hence has remained on the fringes for many years. - The MCE logging code is still very unclean partly due to its ABI limitations. Fields are being reused for multiple purposes, and the whole message structure is limited and x86 specific to begin with. All in one, the x86 tree would like to move away from this private implementation of an event logging facility to a broader framework. By using perf events we gain the following advantages: - Multiple user-space agents can access MCE events. We can have an mcelog daemon running but also a system-wide tracer capturing important events in flight-recorder mode. - Sampling support: the kernel and the user-space call-chain of MCE events can be stored and analyzed as well. This way actual patterns of bad behavior can be matched to precisely what kind of activity happened in the kernel (and/or in the app) around that moment in time. - Coupling with other hardware and software events: the PMU can track a number of other anomalies - monitoring software might chose to monitor those plus the MCE events as well - in one coherent stream of events. - Discovery of MCE sources - tracepoints are enumerated and tools can act upon the existence (or non-existence) of various channels of MCE information. - Filtering support: we just subscribe to and act upon the events we are interested in. Then even on a per event source basis there's in-kernel filter expressions available that can restrict the amount of data that hits the event channel. - Arbitrary deep per cpu buffering of events - we can buffer 32 entries or we can buffer as much as we want, as long as we have the RAM. - An NMI-safe ring-buffer implementation - mappable to user-space. - Built-in support for timestamping of events, PID markers, CPU markers, etc. - A rich ABI accessible over system call interface. Per cpu, per task and per workload monitoring of MCE events can be done this way. The ABI itself has a nice, meaningful structure. - Extensible ABI: new fields can be added without breaking tooling. New tracepoints can be added as the hardware side evolves. There's various parsers that can be used. - Lots of scheduling/buffering/batching modes of operandi for MCE events. poll() support. mmap() support. read() support. You name it. - Rich tooling support: even without any MCE specific extensions added the 'perf' tool today offers various views of MCE data: perf report, perf stat, perf trace can all be used to view logged MCE events and perhaps correlate them to certain user-space usage patterns. But it can be used directly as well, for user-space agents and policy action in mcelog, etc. With this we hope to achieve significant code cleanup and feature improvements in the MCE code, and we hope to be able to drop the /dev/mcelog facility in the end. This patch is just a plain dumb dump of mce_log() records to the tracepoints / perf events framework - a first proof of concept step. Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Cc: Huang Ying <ying.huang@intel.com> Cc: Andi Kleen <ak@linux.intel.com> LKML-Reference: <4AD42A0D.7050104@jp.fujitsu.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | | | x86: Fix comments of register/stack access functionsMasami Hiramatsu2009-12-021-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix typos and some redundant comments of register/stack access functions in asm/ptrace.h. Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com> Cc: systemtap <systemtap@sources.redhat.com> Cc: DLE <dle-develop@lists.sourceforge.net> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Roland McGrath <roland@redhat.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Wenji Huang <wenji.huang@oracle.com> Cc: Mahesh J Salgaonkar <mahesh@linux.vnet.ibm.com> LKML-Reference: <20091201000222.7669.7477.stgit@harusame> Signed-off-by: Ingo Molnar <mingo@elte.hu> Suggested-by: Wenji Huang <wenji.huang@oracle.com>
| * | | | | | hw-breakpoints: Keep track of user disabled breakpointsFrederic Weisbecker2009-12-021-4/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When we disable a breakpoint through dr7, we unregister it right away, making us lose track of its corresponding address register value. It means that the following sequence would be unsupported: - set address in dr0 - enable it through dr7 - disable it through dr7 - enable it through dr7 because we lost the address register value when we disabled the breakpoint. Don't unregister the disabled breakpoints but rather disable them. Reported-by: "K.Prasad" <prasad@linux.vnet.ibm.com> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <1259735536-9236-1-git-send-regression-fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | | | hw-breakpoints: Use struct perf_event_attr to define user breakpointsFrederic Weisbecker2009-11-271-33/+41
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In-kernel user breakpoints are created using functions in which we pass breakpoint parameters as individual variables: address, length and type. Although it fits well for x86, this just does not scale across archictectures that may support this api later as these may have more or different needs. Pass in a perf_event_attr structure instead because it is meant to evolve as much as possible into a generic hardware breakpoint parameter structure. Reported-by: K.Prasad <prasad@linux.vnet.ibm.com> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <1259294154-5197-1-git-send-regression-fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | | | x86/hw-breakpoints: Don't lose GE flag while disabling a breakpointFrederic Weisbecker2009-11-261-8/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When we schedule out a breakpoint from the cpu, we also incidentally remove the "Global exact breakpoint" flag from the breakpoint control register. It makes us losing the fine grained precision about the origin of the instructions that may trigger breakpoint exceptions for the other breakpoints running in this cpu. Reported-by: Prasad <prasad@linux.vnet.ibm.com> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <1259211878-6013-1-git-send-regression-fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | | | hw-breakpoints: Simplify error handling in breakpoint creation requestsFrederic Weisbecker2009-11-261-7/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This simplifies the error handling when we create a breakpoint. We don't need to check the NULL return value corner case anymore since we have improved perf_event_create_kernel_counter() to always return an error code in the failure case. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Prasad <prasad@linux.vnet.ibm.com> LKML-Reference: <1259210142-5714-3-git-send-regression-fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>