From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001 From: Jeremy Cline Date: Tue, 23 Jul 2019 15:24:30 +0000 Subject: [PATCH] kdump: add support for crashkernel=auto Rebased for v5.3-rc1 because the documentation has moved. Message-id: <20180604013831.574215750@redhat.com> Patchwork-id: 8166 O-Subject: [kernel team] [PATCH RHEL8.0 V2 2/2] kdump: add support for crashkernel=auto Bugzilla: 1507353 RH-Acked-by: Don Zickus RH-Acked-by: Baoquan He RH-Acked-by: Pingfan Liu Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1507353 Build: https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=16534135 Tested: ppc64le, x86_64 with several memory sizes. kdump qe tested 160M on various x86 machines in lab. We continue to provide crashkernel=auto like we did in RHEL6 and RHEL7, this will simplify the kdump deployment for common use cases that kdump just works with the auto reserved values. But this is still a best effort estimation, we can not know the exact memory requirement because it depends on a lot of different factors. The implementation of crashkernel=auto is simplified as a wrapper to use below kernel cmdline: x86_64: crashkernel=1G-64G:160M,64G-1T:256M,1T-:512M s390x: crashkernel=4G-64G:160M,64G-1T:256M,1T-:512M arm64: crashkernel=2G-:512M ppc64: crashkernel=2G-4G:384M,4G-16G:512M,16G-64G:1G,64G-128G:2G,128G-:4G The difference between this way and the old implementation in RHEL6/7 is we do not scale the crash reserved memory size according to system memory size anymore. Latest effort to move upstream is below thread: https://lkml.org/lkml/2018/5/20/262 But unfortunately it is still unlikely to be accepted, thus we will still use a RHEL only patch in RHEL8. Copied old patch description about the history reason see below: ''' Non-upstream explanations: Besides "crashkenrel=X@Y" format, upstream also has advanced "crashkernel=range1:size1[,range2:size2,...][@offset]", and "crashkernel=X,high{low}" formats, but they need more careful manual configuration, and have different values for different architectures. Most of the distributions use the standard "crashkernel=X@Y" upstream format, and use crashkernel range format for advanced scenarios, heavily relying on the user's involvement. While "crashkernel=auto" is redhat's special feature, it exists and has been used as the default boot cmdline since 2008 rhel6. It does not require users to figure out how many crash memory size for their systems, also has been proved to be able to work pretty well for common scenarios. "crashkernel=auto" was tested/based on rhel-related products, as we have stable kernel configurations which means more or less stable memory consumption. In 2014 we tried to post them again to upstream but NACKed by people because they think it's not general and unnecessary, users can specify their own values or do that by scripts. However our customers insist on having it added to rhel. Also see one previous discussion related to this backport to Pegas: On 10/17/2016 at 10:15 PM, Don Zickus wrote: > On Fri, Oct 14, 2016 at 10:57:41AM +0800, Dave Young wrote: >> Don, agree with you we should evaluate them instead of just inherit >> them blindly. Below is what I think about kdump auto memory: >> There are two issues for crashkernel=auto in upstream: >> 1) It will be seen as a policy which should not go to kernel >> 2) It is hard to get a good number for the crash reserved size, >> considering various different kernel config options one can setups. >> In RHEL we are easier because our supported Kconfig is limited. >> I digged the upstream mail archive, but I'm not sure I got all the >> information, at least Michael Ellerman was objecting the series for >> 1). > Yes, I know. Vivek and I have argued about this for years. :-) > > I had hoped all the changes internally to the makedumpfile would allow > the memory configuration to stabilize at a number like 192M or 128M and > only in the rare cases extend beyond that. > > So I always treated that as a temporary hack until things were better. > With the hope of every new RHEL release we get smarter and better. :-) > Ideally it would be great if we could get the number down to 64M for most > cases and just turn it on in Fedora. Maybe someday.... ;-) > > We can have this conversation when the patch gets reposted/refreshed > for upstream on rhkl? > > Cheers, > Don We had proposed to drop the historic crashkernel=auto code and move to use crashkernel=range:size format and pass them in anaconda. The initial reason is crashkernel=range:size works just fine because we do not need complex algorithm to scale crashkernel reserved size any more. The old linear scaling is mainly for old makedumpfile requirements, now it is not necessary. But With the new approach, backward compatibility is potentially at risk. For e.g. let's consider the following cases: 1) When we upgrade from an older distribution like rhel-alt-7.4(which uses crashkernel=auto) to rhel-alt-7.5 (which uses the crashkernel=xY format) In this case we can use anaconda scripts for checking 'crashkernel=auto' in kernel spec and update to the new 'crashkernel=range:size' format. 2) When we upgrade from rhel-alt-7.5(which uses crashkernel=xY format) to rhel-alt-7.6(which uses crashkernel=xY format), but the x and/or Y values are changed in rhel-alt-7.6. For example from crashkernel=2G-:160M to crashkernel=2G-:192M, then we have no way to determine if the X and/or Y values were distribution provided or user specified ones. Since it is recommended to give precedence to user-specified values, so we cannot do an upgrade in such a case." Thus turn back to resolve it in kernel, and add a simpler version which just hacks to use the range:size style in code, and make rhel-only code easily to maintain. ''' Signed-off-by: Dave Young Signed-off-by: Herton R. Krzesinski Upstream Status: RHEL only Signed-off-by: Jeremy Cline --- Documentation/admin-guide/kdump/kdump.rst | 11 +++++++++++ kernel/crash_core.c | 14 ++++++++++++++ 2 files changed, 25 insertions(+) diff --git a/Documentation/admin-guide/kdump/kdump.rst b/Documentation/admin-guide/kdump/kdump.rst index 2da65fef2a1c..d53a524f80f0 100644 --- a/Documentation/admin-guide/kdump/kdump.rst +++ b/Documentation/admin-guide/kdump/kdump.rst @@ -285,6 +285,17 @@ This would mean: 2) if the RAM size is between 512M and 2G (exclusive), then reserve 64M 3) if the RAM size is larger than 2G, then reserve 128M +Or you can use crashkernel=auto if you have enough memory. The threshold +is 2G on x86_64, arm64, ppc64 and ppc64le. The threshold is 4G for s390x. +If your system memory is less than the threshold crashkernel=auto will not +reserve memory. + +The automatically reserved memory size varies based on architecture. +The size changes according to system memory size like below: + x86_64: 1G-64G:160M,64G-1T:256M,1T-:512M + s390x: 4G-64G:160M,64G-1T:256M,1T-:512M + arm64: 2G-:512M + ppc64: 2G-4G:384M,4G-16G:512M,16G-64G:1G,64G-128G:2G,128G-:4G Boot into System Kernel diff --git a/kernel/crash_core.c b/kernel/crash_core.c index 0b3f738178b8..a04650f7f6c2 100644 --- a/kernel/crash_core.c +++ b/kernel/crash_core.c @@ -260,6 +260,20 @@ static int __init __parse_crashkernel(char *cmdline, if (suffix) return parse_crashkernel_suffix(ck_cmdline, crash_size, suffix); + + if (strncmp(ck_cmdline, "auto", 4) == 0) { +#ifdef CONFIG_X86_64 + ck_cmdline = "1G-64G:160M,64G-1T:256M,1T-:512M"; +#elif defined(CONFIG_S390) + ck_cmdline = "4G-64G:160M,64G-1T:256M,1T-:512M"; +#elif defined(CONFIG_ARM64) + ck_cmdline = "2G-:512M"; +#elif defined(CONFIG_PPC64) + ck_cmdline = "2G-4G:384M,4G-16G:512M,16G-64G:1G,64G-128G:2G,128G-:4G"; +#endif + pr_info("Using crashkernel=auto, the size choosed is a best effort estimation.\n"); + } + /* * if the commandline contains a ':', then that's the extended * syntax -- if not, it must be the classic syntax -- 2.28.0