summaryrefslogtreecommitdiffstats
path: root/Documentation/vm/hwpoison.txt
diff options
context:
space:
mode:
Diffstat (limited to 'Documentation/vm/hwpoison.txt')
-rw-r--r--Documentation/vm/hwpoison.txt182
1 files changed, 0 insertions, 182 deletions
diff --git a/Documentation/vm/hwpoison.txt b/Documentation/vm/hwpoison.txt
deleted file mode 100644
index 5500684..0000000
--- a/Documentation/vm/hwpoison.txt
+++ /dev/null
@@ -1,182 +0,0 @@
-What is hwpoison?
-
-Upcoming Intel CPUs have support for recovering from some memory errors
-(``MCA recovery''). This requires the OS to declare a page "poisoned",
-kill the processes associated with it and avoid using it in the future.
-
-This patchkit implements the necessary infrastructure in the VM.
-
-To quote the overview comment:
-
- * High level machine check handler. Handles pages reported by the
- * hardware as being corrupted usually due to a 2bit ECC memory or cache
- * failure.
- *
- * This focusses on pages detected as corrupted in the background.
- * When the current CPU tries to consume corruption the currently
- * running process can just be killed directly instead. This implies
- * that if the error cannot be handled for some reason it's safe to
- * just ignore it because no corruption has been consumed yet. Instead
- * when that happens another machine check will happen.
- *
- * Handles page cache pages in various states. The tricky part
- * here is that we can access any page asynchronous to other VM
- * users, because memory failures could happen anytime and anywhere,
- * possibly violating some of their assumptions. This is why this code
- * has to be extremely careful. Generally it tries to use normal locking
- * rules, as in get the standard locks, even if that means the
- * error handling takes potentially a long time.
- *
- * Some of the operations here are somewhat inefficient and have non
- * linear algorithmic complexity, because the data structures have not
- * been optimized for this case. This is in particular the case
- * for the mapping from a vma to a process. Since this case is expected
- * to be rare we hope we can get away with this.
-
-The code consists of a the high level handler in mm/memory-failure.c,
-a new page poison bit and various checks in the VM to handle poisoned
-pages.
-
-The main target right now is KVM guests, but it works for all kinds
-of applications. KVM support requires a recent qemu-kvm release.
-
-For the KVM use there was need for a new signal type so that
-KVM can inject the machine check into the guest with the proper
-address. This in theory allows other applications to handle
-memory failures too. The expection is that near all applications
-won't do that, but some very specialized ones might.
-
----
-
-There are two (actually three) modi memory failure recovery can be in:
-
-vm.memory_failure_recovery sysctl set to zero:
- All memory failures cause a panic. Do not attempt recovery.
- (on x86 this can be also affected by the tolerant level of the
- MCE subsystem)
-
-early kill
- (can be controlled globally and per process)
- Send SIGBUS to the application as soon as the error is detected
- This allows applications who can process memory errors in a gentle
- way (e.g. drop affected object)
- This is the mode used by KVM qemu.
-
-late kill
- Send SIGBUS when the application runs into the corrupted page.
- This is best for memory error unaware applications and default
- Note some pages are always handled as late kill.
-
----
-
-User control:
-
-vm.memory_failure_recovery
- See sysctl.txt
-
-vm.memory_failure_early_kill
- Enable early kill mode globally
-
-PR_MCE_KILL
- Set early/late kill mode/revert to system default
- arg1: PR_MCE_KILL_CLEAR: Revert to system default
- arg1: PR_MCE_KILL_SET: arg2 defines thread specific mode
- PR_MCE_KILL_EARLY: Early kill
- PR_MCE_KILL_LATE: Late kill
- PR_MCE_KILL_DEFAULT: Use system global default
-PR_MCE_KILL_GET
- return current mode
-
-
----
-
-Testing:
-
-madvise(MADV_HWPOISON, ....)
- (as root)
- Poison a page in the process for testing
-
-
-hwpoison-inject module through debugfs
-
-/sys/debug/hwpoison/
-
-corrupt-pfn
-
-Inject hwpoison fault at PFN echoed into this file. This does
-some early filtering to avoid corrupted unintended pages in test suites.
-
-unpoison-pfn
-
-Software-unpoison page at PFN echoed into this file. This
-way a page can be reused again.
-This only works for Linux injected failures, not for real
-memory failures.
-
-Note these injection interfaces are not stable and might change between
-kernel versions
-
-corrupt-filter-dev-major
-corrupt-filter-dev-minor
-
-Only handle memory failures to pages associated with the file system defined
-by block device major/minor. -1U is the wildcard value.
-This should be only used for testing with artificial injection.
-
-corrupt-filter-memcg
-
-Limit injection to pages owned by memgroup. Specified by inode number
-of the memcg.
-
-Example:
- mkdir /sys/fs/cgroup/mem/hwpoison
-
- usemem -m 100 -s 1000 &
- echo `jobs -p` > /sys/fs/cgroup/mem/hwpoison/tasks
-
- memcg_ino=$(ls -id /sys/fs/cgroup/mem/hwpoison | cut -f1 -d' ')
- echo $memcg_ino > /debug/hwpoison/corrupt-filter-memcg
-
- page-types -p `pidof init` --hwpoison # shall do nothing
- page-types -p `pidof usemem` --hwpoison # poison its pages
-
-corrupt-filter-flags-mask
-corrupt-filter-flags-value
-
-When specified, only poison pages if ((page_flags & mask) == value).
-This allows stress testing of many kinds of pages. The page_flags
-are the same as in /proc/kpageflags. The flag bits are defined in
-include/linux/kernel-page-flags.h and documented in
-Documentation/vm/pagemap.txt
-
-Architecture specific MCE injector
-
-x86 has mce-inject, mce-test
-
-Some portable hwpoison test programs in mce-test, see blow.
-
----
-
-References:
-
-http://halobates.de/mce-lc09-2.pdf
- Overview presentation from LinuxCon 09
-
-git://git.kernel.org/pub/scm/utils/cpu/mce/mce-test.git
- Test suite (hwpoison specific portable tests in tsrc)
-
-git://git.kernel.org/pub/scm/utils/cpu/mce/mce-inject.git
- x86 specific injector
-
-
----
-
-Limitations:
-
-- Not all page types are supported and never will. Most kernel internal
-objects cannot be recovered, only LRU pages for now.
-- Right now hugepage support is missing.
-
----
-Andi Kleen, Oct 2009
-