Diffstat (limited to 'pci-crs-fixes.patch')
-rw-r--r-- | pci-crs-fixes.patch | 593 |
1 files changed, 0 insertions, 593 deletions
diff --git a/pci-crs-fixes.patch b/pci-crs-fixes.patch deleted file mode 100644 index b4fd01502..000000000 --- a/pci-crs-fixes.patch +++ /dev/null @@ -1,593 +0,0 @@ - This revision is to address two problems found by Horst H. von Brand while - testing the v2 patches in Fedora: - https://bugzilla.redhat.com/show_bug.cgi?id=637647 - On his machine, we don't use _CRS by default, and the BIOS left some bridge - windows disabled. - - Problem 1: When we assigned space for the windows, we started at the top - and allocated [mem 0xffffffffffe00000-0xffffffffffffffff], which is - obviously useless because the CPU doesn't support physical addresses that - large. - - Problem 2: Subsequent allocations failed because I made an error in - find_resource(). We look for available space from [child->end + 1 to - root->end], and if the last child ends exactly at 0xffffffffffffffff, we - wrap around and start from zero. - - I made the top-down allocation conditional: an arch can select it at - boot-time, and there's a kernel command line option to change it for - debugging. - - -When we move PCI devices, we currently allocate space bottom-up, i.e., we look -at PCI bus resources in the order we found them, we look at gaps between child -resources bottom-up, and we align the new space at the bottom of an available -region. - -On x86, we move PCI devices more than we used to because we now pay attention -to the PCI host bridge windows from ACPI. For example, when we find a device -that's outside all the known host bridge windows, we try to move it into a -window, and we look for space starting at the bottom. - -Windows does similar device moves, but it looks for space top-down rather than -bottom-up. Since most machines are better-tested with Windows than Linux, this -difference means that Linux is more likely to trip over BIOS bugs in the PCI -host bridge window descriptions than Windows is. 
- -We've had several reports of Dell machines where the BIOS leaves the AHCI -controller outside the host bridge windows (BIOS bug #1), *and* the lowest -host bridge window includes an area that doesn't actually reach PCI (BIOS -bug #2). The result is that Windows (which moves AHCI to the top of a window) -works fine, while Linux (which moves AHCI to the bottom, buggy, area) doesn't -work. - -These patches change Linux to allocate space more like Windows does: - - 1) The x86 pcibios_align_resource() will choose space from the - end of an available area, not the beginning. - - 2) In the generic allocate_resource() path, we'll look for space - between existing children from the top, not from the bottom. - - 3) When pci_bus_alloc_resource() looks for available space, it - will start from the highest window, not the first one we found. - -This series fixes a 2.6.34 regression that prevents many Dell Precision -workstations from booting: - - https://bugzilla.kernel.org/show_bug.cgi?id=16228 - -Changes from v3 to v4: - - Use round_down() rather than adding ALIGN_DOWN(). - - Replace ARCH_HAS_TOP_DOWN_ALLOC #define with a boot-time architecture - choice and add a "resource_alloc_from_bottom" command line option to - revert to the old behavior (NOTE: this only affects allocate_resource(), - not pcibios_align_resource() or pci_bus_alloc_resource()). - - Fixed find_resource_from_top() again; it still didn't handle a child - that ended at the parent's end correctly. - -Changes from v2 to v3: - - Updated iomem_resource.end to reflect the end of usable physical address - space. Otherwise, we might allocate right up to 0xffffffff_ffffffff, - which isn't usable. - - Make allocate_resource() change conditional on ARCH_HAS_TOP_DOWN_ALLOC. - Without arch-specific changes like the above, it's too dangerous to - make this change for everybody at once. - - Fix 64-bit wraparound in find_resource(). 
If the last child happened - to end at ~0, we computed the highest available space as [child->end + 1, - root->end], which makes us think the available space started at 0, - which makes us return space that may already be allocated. - -Changes from v1 to v2: - - Moved check for allocating before the available area from - pcibios_align_resource() to find_resource(). Better to do it - after the alignment callback is done, and make it generic. - - Fixed pcibios_align_resource() alignment. If we start from the - end of the available area, we must align *downward*, not upward. - - Fixed pcibios_align_resource() ISA alias avoidance. Again, since - the starting point is the end of the area, we must align downward - when we avoid aliased areas. ---- - -Bjorn Helgaas (6): - resources: ensure alignment callback doesn't allocate below available start - resources: support allocating space within a region from the top down - PCI: allocate bus resources from the top down - x86/PCI: allocate space from the end of a region, not the beginning - x86: update iomem_resource end based on CPU physical address capabilities - x86: allocate space within a region top-down - - - Documentation/kernel-parameters.txt | 5 ++ - arch/x86/kernel/setup.c | 2 + - arch/x86/pci/i386.c | 17 ++++-- - drivers/pci/bus.c | 53 +++++++++++++++++-- - include/linux/ioport.h | 1 - kernel/resource.c | 99 ++++++++++++++++++++++++++++++++++- - 6 files changed, 163 insertions(+), 14 deletions(-) --- -To unsubscribe from this list: send the line "unsubscribe linux-pci" in -the body of a message to majordomo@vger.kernel.org -More majordomo info at http://vger.kernel.org/majordomo-info.html -The alignment callback returns a proposed location, which may have been -adjusted to avoid ISA aliases or for other architecture-specific reasons. -We already had a check ("tmp.start < tmp.end") to make sure the callback -doesn't return a location above the available area. 
- -This patch adds a check to make sure the callback doesn't return something -*below* the available area, as may happen if the callback tries to allocate -top-down. - -Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com> ---- - - kernel/resource.c | 10 ++++++++-- - 1 files changed, 8 insertions(+), 2 deletions(-) - - -diff --git a/kernel/resource.c b/kernel/resource.c -index 7b36976..ace2269 100644 ---- a/kernel/resource.c -+++ b/kernel/resource.c -@@ -371,6 +371,7 @@ static int find_resource(struct resource *root, struct resource *new, - { - struct resource *this = root->child; - struct resource tmp = *new; -+ resource_size_t start; - - tmp.start = root->start; - /* -@@ -391,8 +392,13 @@ static int find_resource(struct resource *root, struct resource *new, - if (tmp.end > max) - tmp.end = max; - tmp.start = ALIGN(tmp.start, align); -- if (alignf) -- tmp.start = alignf(alignf_data, &tmp, size, align); -+ if (alignf) { -+ start = alignf(alignf_data, &tmp, size, align); -+ if (tmp.start <= start && start <= tmp.end) -+ tmp.start = start; -+ else -+ tmp.start = tmp.end; -+ } - if (tmp.start < tmp.end && tmp.end - tmp.start >= size - 1) { - new->start = tmp.start; - new->end = tmp.start + size - 1; - --- -To unsubscribe from this list: send the line "unsubscribe linux-pci" in -the body of a message to majordomo@vger.kernel.org -More majordomo info at http://vger.kernel.org/majordomo-info.html -Allocate space from the top of a region first, then work downward, -if an architecture desires this. - -When we allocate space from a resource, we look for gaps between children -of the resource. Previously, we always looked at gaps from the bottom up. -For example, given this: - - [mem 0xbff00000-0xf7ffffff] PCI Bus 0000:00 - [mem 0xbff00000-0xbfffffff] gap -- available - [mem 0xc0000000-0xdfffffff] PCI Bus 0000:02 - [mem 0xe0000000-0xf7ffffff] gap -- available - -we attempted to allocate from the [mem 0xbff00000-0xbfffffff] gap first, -then the [mem 0xe0000000-0xf7ffffff] gap. 
- -With this patch an architecture can choose to allocate from the top gap -[mem 0xe0000000-0xf7ffffff] first. - -We can't do this across the board because iomem_resource.end is initialized -to 0xffffffff_ffffffff on 64-bit architectures, and most machines can't -address the entire 64-bit physical address space. Therefore, we only -allocate top-down if the arch requests it by clearing -"resource_alloc_from_bottom". - -Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com> ---- - - Documentation/kernel-parameters.txt | 5 ++ - include/linux/ioport.h | 1 - kernel/resource.c | 89 +++++++++++++++++++++++++++++++++++ - 3 files changed, 94 insertions(+), 1 deletions(-) - - -diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt -index 8dd7248..fe50cbd 100644 ---- a/Documentation/kernel-parameters.txt -+++ b/Documentation/kernel-parameters.txt -@@ -2156,6 +2156,11 @@ and is between 256 and 4096 characters. It is defined in the file - reset_devices [KNL] Force drivers to reset the underlying device - during initialization. - -+ resource_alloc_from_bottom -+ Allocate new resources from the beginning of available -+ space, not the end. If you need to use this, please -+ report a bug. 
-+ - resume= [SWSUSP] - Specify the partition device for software suspend - -diff --git a/include/linux/ioport.h b/include/linux/ioport.h -index b227902..d377ea8 100644 ---- a/include/linux/ioport.h -+++ b/include/linux/ioport.h -@@ -112,6 +112,7 @@ struct resource_list { - /* PC/ISA/whatever - the normal PC address spaces: IO and memory */ - extern struct resource ioport_resource; - extern struct resource iomem_resource; -+extern int resource_alloc_from_bottom; - - extern struct resource *request_resource_conflict(struct resource *root, struct resource *new); - extern int request_resource(struct resource *root, struct resource *new); -diff --git a/kernel/resource.c b/kernel/resource.c -index ace2269..8d337a9 100644 ---- a/kernel/resource.c -+++ b/kernel/resource.c -@@ -40,6 +40,23 @@ EXPORT_SYMBOL(iomem_resource); - - static DEFINE_RWLOCK(resource_lock); - -+/* -+ * By default, we allocate free space bottom-up. The architecture can request -+ * top-down by clearing this flag. The user can override the architecture's -+ * choice with the "resource_alloc_from_bottom" kernel boot option, but that -+ * should only be a debugging tool. -+ */ -+int resource_alloc_from_bottom = 1; -+ -+static __init int setup_alloc_from_bottom(char *s) -+{ -+ printk(KERN_INFO -+ "resource: allocating from bottom-up; please report a bug\n"); -+ resource_alloc_from_bottom = 1; -+ return 0; -+} -+early_param("resource_alloc_from_bottom", setup_alloc_from_bottom); -+ - static void *r_next(struct seq_file *m, void *v, loff_t *pos) - { - struct resource *p = v; -@@ -358,7 +375,74 @@ int __weak page_is_ram(unsigned long pfn) - } - - /* -+ * Find the resource before "child" in the sibling list of "root" children. 
-+ */ -+static struct resource *find_sibling_prev(struct resource *root, struct resource *child) -+{ -+ struct resource *this; -+ -+ for (this = root->child; this; this = this->sibling) -+ if (this->sibling == child) -+ return this; -+ -+ return NULL; -+} -+ -+/* -+ * Find empty slot in the resource tree given range and alignment. -+ * This version allocates from the end of the root resource first. -+ */ -+static int find_resource_from_top(struct resource *root, struct resource *new, -+ resource_size_t size, resource_size_t min, -+ resource_size_t max, resource_size_t align, -+ resource_size_t (*alignf)(void *, -+ const struct resource *, -+ resource_size_t, -+ resource_size_t), -+ void *alignf_data) -+{ -+ struct resource *this; -+ struct resource tmp = *new; -+ resource_size_t start; -+ -+ tmp.start = root->end; -+ tmp.end = root->end; -+ -+ this = find_sibling_prev(root, NULL); -+ for (;;) { -+ if (this) { -+ if (this->end < root->end) -+ tmp.start = this->end + 1; -+ } else -+ tmp.start = root->start; -+ if (tmp.start < min) -+ tmp.start = min; -+ if (tmp.end > max) -+ tmp.end = max; -+ tmp.start = ALIGN(tmp.start, align); -+ if (alignf) { -+ start = alignf(alignf_data, &tmp, size, align); -+ if (tmp.start <= start && start <= tmp.end) -+ tmp.start = start; -+ else -+ tmp.start = tmp.end; -+ } -+ if (tmp.start < tmp.end && tmp.end - tmp.start >= size - 1) { -+ new->start = tmp.start; -+ new->end = tmp.start + size - 1; -+ return 0; -+ } -+ if (!this || this->start == root->start) -+ break; -+ tmp.end = this->start - 1; -+ this = find_sibling_prev(root, this); -+ } -+ return -EBUSY; -+} -+ -+/* - * Find empty slot in the resource tree given range and alignment. -+ * This version allocates from the beginning of the root resource first. 
- */ - static int find_resource(struct resource *root, struct resource *new, - resource_size_t size, resource_size_t min, -@@ -435,7 +519,10 @@ int allocate_resource(struct resource *root, struct resource *new, - int err; - - write_lock(&resource_lock); -- err = find_resource(root, new, size, min, max, align, alignf, alignf_data); -+ if (resource_alloc_from_bottom) -+ err = find_resource(root, new, size, min, max, align, alignf, alignf_data); -+ else -+ err = find_resource_from_top(root, new, size, min, max, align, alignf, alignf_data); - if (err >= 0 && __request_resource(root, new)) - err = -EBUSY; - write_unlock(&resource_lock); - --- -To unsubscribe from this list: send the line "unsubscribe linux-pci" in -the body of a message to majordomo@vger.kernel.org -More majordomo info at http://vger.kernel.org/majordomo-info.html -Allocate space from the highest-address PCI bus resource first, then work -downward. - -Previously, we looked for space in PCI host bridge windows in the order -we discovered the windows. For example, given the following windows -(discovered via an ACPI _CRS method): - - pci_root PNP0A03:00: host bridge window [mem 0x000a0000-0x000bffff] - pci_root PNP0A03:00: host bridge window [mem 0x000c0000-0x000effff] - pci_root PNP0A03:00: host bridge window [mem 0x000f0000-0x000fffff] - pci_root PNP0A03:00: host bridge window [mem 0xbff00000-0xf7ffffff] - pci_root PNP0A03:00: host bridge window [mem 0xff980000-0xff980fff] - pci_root PNP0A03:00: host bridge window [mem 0xff97c000-0xff97ffff] - pci_root PNP0A03:00: host bridge window [mem 0xfed20000-0xfed9ffff] - -we attempted to allocate from [mem 0x000a0000-0x000bffff] first, then -[mem 0x000c0000-0x000effff], and so on. - -With this patch, we allocate from [mem 0xff980000-0xff980fff] first, then -[mem 0xff97c000-0xff97ffff], [mem 0xfed20000-0xfed9ffff], etc. - -Allocating top-down follows Windows practice, so we're less likely to -trip over BIOS defects in the _CRS description. 
- -On the machine above (a Dell T3500), the [mem 0xbff00000-0xbfffffff] region -doesn't actually work and is likely a BIOS defect. The symptom is that we -move the AHCI controller to 0xbff00000, which leads to "Boot has failed, -sleeping forever," a BUG in ahci_stop_engine(), or some other boot failure. - -Reference: https://bugzilla.kernel.org/show_bug.cgi?id=16228#c43 -Reference: https://bugzilla.redhat.com/show_bug.cgi?id=620313 -Reference: https://bugzilla.redhat.com/show_bug.cgi?id=629933 -Reported-by: Brian Bloniarz <phunge0@hotmail.com> -Reported-and-tested-by: Stefan Becker <chemobejk@gmail.com> -Reported-by: Denys Vlasenko <dvlasenk@redhat.com> -Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com> ---- - - drivers/pci/bus.c | 53 ++++++++++++++++++++++++++++++++++++++++++++++++----- - 1 files changed, 48 insertions(+), 5 deletions(-) - - -diff --git a/drivers/pci/bus.c b/drivers/pci/bus.c -index 7f0af0e..172bf26 100644 ---- a/drivers/pci/bus.c -+++ b/drivers/pci/bus.c -@@ -64,6 +64,49 @@ void pci_bus_remove_resources(struct pci_bus *bus) - } - } - -+/* -+ * Find the highest-address bus resource below the cursor "res". If the -+ * cursor is NULL, return the highest resource. -+ */ -+static struct resource *pci_bus_find_resource_prev(struct pci_bus *bus, -+ unsigned int type, -+ struct resource *res) -+{ -+ struct resource *r, *prev = NULL; -+ int i; -+ -+ pci_bus_for_each_resource(bus, r, i) { -+ if (!r) -+ continue; -+ -+ if ((r->flags & IORESOURCE_TYPE_BITS) != type) -+ continue; -+ -+ /* If this resource is at or past the cursor, skip it */ -+ if (res) { -+ if (r == res) -+ continue; -+ if (r->end > res->end) -+ continue; -+ if (r->end == res->end && r->start > res->start) -+ continue; -+ } -+ -+ if (!prev) -+ prev = r; -+ -+ /* -+ * A small resource is higher than a large one that ends at -+ * the same address. 
-+ */ -+ if (r->end > prev->end || -+ (r->end == prev->end && r->start > prev->start)) -+ prev = r; -+ } -+ -+ return prev; -+} -+ - /** - * pci_bus_alloc_resource - allocate a resource from a parent bus - * @bus: PCI bus -@@ -89,9 +132,10 @@ pci_bus_alloc_resource(struct pci_bus *bus, struct resource *res, - resource_size_t), - void *alignf_data) - { -- int i, ret = -ENOMEM; -+ int ret = -ENOMEM; - struct resource *r; - resource_size_t max = -1; -+ unsigned int type = res->flags & IORESOURCE_TYPE_BITS; - - type_mask |= IORESOURCE_IO | IORESOURCE_MEM; - -@@ -99,10 +143,9 @@ pci_bus_alloc_resource(struct pci_bus *bus, struct resource *res, - if (!(res->flags & IORESOURCE_MEM_64)) - max = PCIBIOS_MAX_MEM_32; - -- pci_bus_for_each_resource(bus, r, i) { -- if (!r) -- continue; -- -+ /* Look for space at highest addresses first */ -+ r = pci_bus_find_resource_prev(bus, type, NULL); -+ for ( ; r; r = pci_bus_find_resource_prev(bus, type, r)) { - /* type_mask must match */ - if ((res->flags ^ r->flags) & type_mask) - continue; - --- -To unsubscribe from this list: send the line "unsubscribe linux-pci" in -the body of a message to majordomo@vger.kernel.org -More majordomo info at http://vger.kernel.org/majordomo-info.html -Allocate from the end of a region, not the beginning. - -For example, if we need to allocate 0x800 bytes for a device on bus -0000:00 given these resources: - - [mem 0xbff00000-0xdfffffff] PCI Bus 0000:00 - [mem 0xc0000000-0xdfffffff] PCI Bus 0000:02 - -the available space at [mem 0xbff00000-0xbfffffff] is passed to the -alignment callback (pcibios_align_resource()). Prior to this patch, we -would put the new 0x800 byte resource at the beginning of that available -space, i.e., at [mem 0xbff00000-0xbff007ff]. - -With this patch, we put it at the end, at [mem 0xbffff800-0xbfffffff]. 
- -Reference: https://bugzilla.kernel.org/show_bug.cgi?id=16228#c41 -Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com> ---- - - arch/x86/pci/i386.c | 17 +++++++++++------ - 1 files changed, 11 insertions(+), 6 deletions(-) - - -diff --git a/arch/x86/pci/i386.c b/arch/x86/pci/i386.c -index 5525309..826140a 100644 ---- a/arch/x86/pci/i386.c -+++ b/arch/x86/pci/i386.c -@@ -65,16 +65,21 @@ pcibios_align_resource(void *data, const struct resource *res, - resource_size_t size, resource_size_t align) - { - struct pci_dev *dev = data; -- resource_size_t start = res->start; -+ resource_size_t start = round_down(res->end - size + 1, align); - - if (res->flags & IORESOURCE_IO) { -- if (skip_isa_ioresource_align(dev)) -- return start; -- if (start & 0x300) -- start = (start + 0x3ff) & ~0x3ff; -+ -+ /* -+ * If we're avoiding ISA aliases, the largest contiguous I/O -+ * port space is 256 bytes. Clearing bits 9 and 10 preserves -+ * all 256-byte and smaller alignments, so the result will -+ * still be correctly aligned. -+ */ -+ if (!skip_isa_ioresource_align(dev)) -+ start &= ~0x300; - } else if (res->flags & IORESOURCE_MEM) { - if (start < BIOS_END) -- start = BIOS_END; -+ start = res->end; /* fail; no space */ - } - return start; - } - --- -To unsubscribe from this list: send the line "unsubscribe linux-pci" in -the body of a message to majordomo@vger.kernel.org -More majordomo info at http://vger.kernel.org/majordomo-info.html -The iomem_resource map reflects the available physical address space. -We statically initialize the end to -1, i.e., 0xffffffff_ffffffff, but -of course we can only use as much as the CPU can address. - -This patch updates the end based on the CPU capabilities, so we don't -mistakenly allocate space that isn't usable, as we're likely to do when -allocating from the top-down. 
- -Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com> ---- - - arch/x86/kernel/setup.c | 1 + - 1 files changed, 1 insertions(+), 0 deletions(-) - - -diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c -index c3a4fbb..922b5a1 100644 ---- a/arch/x86/kernel/setup.c -+++ b/arch/x86/kernel/setup.c -@@ -788,6 +788,7 @@ void __init setup_arch(char **cmdline_p) - - x86_init.oem.arch_setup(); - -+ iomem_resource.end = (1ULL << boot_cpu_data.x86_phys_bits) - 1; - setup_memory_map(); - parse_setup_data(); - /* update the e820_saved too */ - --- -To unsubscribe from this list: send the line "unsubscribe linux-pci" in -the body of a message to majordomo@vger.kernel.org -More majordomo info at http://vger.kernel.org/majordomo-info.html -Request that allocate_resource() use available space from high addresses -first, rather than the default of using low addresses first. - -The most common place this makes a difference is when we move or assign -new PCI device resources. Low addresses are generally scarce, so it's -better to use high addresses when possible. This follows Windows practice -for PCI allocation. - -Reference: https://bugzilla.kernel.org/show_bug.cgi?id=16228#c42 -Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com> ---- - - arch/x86/kernel/setup.c | 1 + - 1 files changed, 1 insertions(+), 0 deletions(-) - - -diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c -index 922b5a1..0fe76df 100644 ---- a/arch/x86/kernel/setup.c -+++ b/arch/x86/kernel/setup.c -@@ -788,6 +788,7 @@ void __init setup_arch(char **cmdline_p) - - x86_init.oem.arch_setup(); - -+ resource_alloc_from_bottom = 0; - iomem_resource.end = (1ULL << boot_cpu_data.x86_phys_bits) - 1; - setup_memory_map(); - parse_setup_data(); - --- -To unsubscribe from this list: send the line "unsubscribe linux-pci" in -the body of a message to majordomo@vger.kernel.org -More majordomo info at http://vger.kernel.org/majordomo-info.html
\ No newline at end of file
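The placement rule the removed series describes for the x86 alignment callback (start from the end of the available area and align downward) can be illustrated outside the kernel. The following is a minimal user-space sketch, not the kernel implementation: `round_down_pow2()` and `place_top_down()` are hypothetical names invented for this example, ranges are inclusive as in the `[mem start-end]` notation above, and alignment is assumed to be a power of two.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper (not a kernel API): round x down to a power-of-two
 * alignment. */
static uint64_t round_down_pow2(uint64_t x, uint64_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return x & ~(align - 1);
}

/*
 * Hypothetical sketch: place a block of `size` bytes at the highest aligned
 * address inside the inclusive gap [gap_start, gap_end]. Returns true and
 * stores the start address in *out on success, false if the block does not
 * fit.
 */
static bool place_top_down(uint64_t gap_start, uint64_t gap_end,
			   uint64_t size, uint64_t align, uint64_t *out)
{
	uint64_t start;

	if (size == 0 || gap_end < gap_start)
		return false;

	/*
	 * The gap holds (gap_end - gap_start + 1) bytes. Compare against
	 * size - 1 so a gap ending at ~0 does not overflow the arithmetic.
	 */
	if (size - 1 > gap_end - gap_start)
		return false;

	/* Highest possible start, then align downward, never upward. */
	start = round_down_pow2(gap_end - (size - 1), align);

	/* Aligning downward may have moved us below the available area. */
	if (start < gap_start)
		return false;

	*out = start;
	return true;
}
```

With the gap `[mem 0xbff00000-0xbfffffff]` from the x86/PCI patch description and a 0x800-byte request aligned to 0x800, this sketch returns 0xbffff800, matching the `[mem 0xbffff800-0xbfffffff]` placement the description gives. The final `start < gap_start` test plays the same role as the check the first patch adds to `find_resource()`: a top-down callback can return a location below the available area, so the caller has to reject it.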