diff options
Diffstat (limited to 'Documentation/vm/hwpoison.txt')
-rw-r--r-- | Documentation/vm/hwpoison.txt | 182 |
1 files changed, 0 insertions, 182 deletions
diff --git a/Documentation/vm/hwpoison.txt b/Documentation/vm/hwpoison.txt deleted file mode 100644 index 55006846660..00000000000 --- a/Documentation/vm/hwpoison.txt +++ /dev/null @@ -1,182 +0,0 @@ -What is hwpoison? - -Upcoming Intel CPUs have support for recovering from some memory errors -(``MCA recovery''). This requires the OS to declare a page "poisoned", -kill the processes associated with it and avoid using it in the future. - -This patchkit implements the necessary infrastructure in the VM. - -To quote the overview comment: - - * High level machine check handler. Handles pages reported by the - * hardware as being corrupted usually due to a 2bit ECC memory or cache - * failure. - * - * This focusses on pages detected as corrupted in the background. - * When the current CPU tries to consume corruption the currently - * running process can just be killed directly instead. This implies - * that if the error cannot be handled for some reason it's safe to - * just ignore it because no corruption has been consumed yet. Instead - * when that happens another machine check will happen. - * - * Handles page cache pages in various states. The tricky part - * here is that we can access any page asynchronous to other VM - * users, because memory failures could happen anytime and anywhere, - * possibly violating some of their assumptions. This is why this code - * has to be extremely careful. Generally it tries to use normal locking - * rules, as in get the standard locks, even if that means the - * error handling takes potentially a long time. - * - * Some of the operations here are somewhat inefficient and have non - * linear algorithmic complexity, because the data structures have not - * been optimized for this case. This is in particular the case - * for the mapping from a vma to a process. Since this case is expected - * to be rare we hope we can get away with this. - -The code consists of a the high level handler in mm/memory-failure.c, -a new page poison bit and various checks in the VM to handle poisoned -pages. - -The main target right now is KVM guests, but it works for all kinds -of applications. KVM support requires a recent qemu-kvm release. - -For the KVM use there was need for a new signal type so that -KVM can inject the machine check into the guest with the proper -address. This in theory allows other applications to handle -memory failures too. The expection is that near all applications -won't do that, but some very specialized ones might. - ---- - -There are two (actually three) modi memory failure recovery can be in: - -vm.memory_failure_recovery sysctl set to zero: - All memory failures cause a panic. Do not attempt recovery. - (on x86 this can be also affected by the tolerant level of the - MCE subsystem) - -early kill - (can be controlled globally and per process) - Send SIGBUS to the application as soon as the error is detected - This allows applications who can process memory errors in a gentle - way (e.g. drop affected object) - This is the mode used by KVM qemu. - -late kill - Send SIGBUS when the application runs into the corrupted page. - This is best for memory error unaware applications and default - Note some pages are always handled as late kill. - ---- - -User control: - -vm.memory_failure_recovery - See sysctl.txt - -vm.memory_failure_early_kill - Enable early kill mode globally - -PR_MCE_KILL - Set early/late kill mode/revert to system default - arg1: PR_MCE_KILL_CLEAR: Revert to system default - arg1: PR_MCE_KILL_SET: arg2 defines thread specific mode - PR_MCE_KILL_EARLY: Early kill - PR_MCE_KILL_LATE: Late kill - PR_MCE_KILL_DEFAULT: Use system global default -PR_MCE_KILL_GET - return current mode - - ---- - -Testing: - -madvise(MADV_HWPOISON, ....) - (as root) - Poison a page in the process for testing - - -hwpoison-inject module through debugfs - -/sys/debug/hwpoison/ - -corrupt-pfn - -Inject hwpoison fault at PFN echoed into this file. This does -some early filtering to avoid corrupted unintended pages in test suites. - -unpoison-pfn - -Software-unpoison page at PFN echoed into this file. This -way a page can be reused again. -This only works for Linux injected failures, not for real -memory failures. - -Note these injection interfaces are not stable and might change between -kernel versions - -corrupt-filter-dev-major -corrupt-filter-dev-minor - -Only handle memory failures to pages associated with the file system defined -by block device major/minor. -1U is the wildcard value. -This should be only used for testing with artificial injection. - -corrupt-filter-memcg - -Limit injection to pages owned by memgroup. Specified by inode number -of the memcg. - -Example: - mkdir /sys/fs/cgroup/mem/hwpoison - - usemem -m 100 -s 1000 & - echo `jobs -p` > /sys/fs/cgroup/mem/hwpoison/tasks - - memcg_ino=$(ls -id /sys/fs/cgroup/mem/hwpoison | cut -f1 -d' ') - echo $memcg_ino > /debug/hwpoison/corrupt-filter-memcg - - page-types -p `pidof init` --hwpoison # shall do nothing - page-types -p `pidof usemem` --hwpoison # poison its pages - -corrupt-filter-flags-mask -corrupt-filter-flags-value - -When specified, only poison pages if ((page_flags & mask) == value). -This allows stress testing of many kinds of pages. The page_flags -are the same as in /proc/kpageflags. The flag bits are defined in -include/linux/kernel-page-flags.h and documented in -Documentation/vm/pagemap.txt - -Architecture specific MCE injector - -x86 has mce-inject, mce-test - -Some portable hwpoison test programs in mce-test, see blow. - ---- - -References: - -http://halobates.de/mce-lc09-2.pdf - Overview presentation from LinuxCon 09 - -git://git.kernel.org/pub/scm/utils/cpu/mce/mce-test.git - Test suite (hwpoison specific portable tests in tsrc) - -git://git.kernel.org/pub/scm/utils/cpu/mce/mce-inject.git - x86 specific injector - - ---- - -Limitations: - -- Not all page types are supported and never will. Most kernel internal -objects cannot be recovered, only LRU pages for now. -- Right now hugepage support is missing. - ---- -Andi Kleen, Oct 2009 - |