Kdump is an advanced crash dumping mechanism. When it is enabled and the system crashes, the system boots into a second kernel from the context of the crashed one. This second kernel runs in a small amount of memory reserved for it, and its only purpose is to capture the core dump image. Because analyzing the core dump helps significantly in determining the exact cause of a system failure, enabling this feature is strongly recommended.
1. Install the kexec-tools package
Kexec is a fast-boot mechanism that allows booting a Linux kernel from the context of an already running kernel without going through the BIOS. Kdump uses kexec to boot into a second kernel whenever the system crashes.
# up2date --nox -u kexec-tools
Fetching Obsoletes list for channel: el5_i386_latest...
########################################
Fetching rpm headers...
########################################
Name                Version        Rel
----------------------------------------------------------
kexec-tools         1.101          194.4.el5.0.1     i386
Testing package set / solving RPM inter-dependencies...
########################################
kexec-tools-1.101-194.4.el5 ########################## Done.
Preparing              ########################################### [100%]
Installing...
   1:kexec-tools       ########################################### [100%]
2. Check the file /boot/config-`uname -r`
The values specified should denote that kexec is enabled and this kernel can be used as a crash kernel:
# cat /boot/config-`uname -r`
CONFIG_KEXEC=y
CONFIG_CRASH_DUMP=y
...
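The same check can be scripted rather than read by eye. The following is a minimal sketch, not part of kexec-tools: check_kdump_config is a hypothetical helper name, and it assumes both options must appear exactly as "=y" in the file it is given.

```shell
# check_kdump_config FILE
# Hypothetical helper: succeeds only if both kernel options are set to "y"
# in the given kernel config file.
check_kdump_config() {
    config_file=$1
    for option in CONFIG_KEXEC CONFIG_CRASH_DUMP; do
        if ! grep -q "^${option}=y\$" "$config_file"; then
            echo "missing: ${option}=y"
            return 1
        fi
    done
    echo "ok: kexec and crash dump support are enabled"
}
```

On a running system it would be called as: check_kdump_config /boot/config-`uname -r`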
3. Modify the system kernel to reserve space for the crash kernel
Edit the /etc/grub.conf file and add “crashkernel=128M@16M” to the kernel line. This reserves 128MB of memory, starting at physical address 0x01000000 (16MB):
# vi /etc/grub.conf
...
title Red Hat Enterprise Linux Server (2.6.18-8.el5)
        root (hd0,0)
        kernel /vmlinuz-2.6.18-8.el5 ro root=/dev/VolGroup00/LogVol00 rhgb quiet crashkernel=128M@16M
        initrd /initrd-2.6.18-8.el5.img
...
The amount of memory to reserve may vary depending on the amount of memory in the system.
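To see how a size@offset value maps to byte addresses, the arithmetic can be sketched in shell. Both function names below are hypothetical, and the sketch assumes the value uses the M suffix for size and offset, as in the example above.

```shell
# mb_to_hex MB
# Print a megabyte count as a zero-padded hexadecimal byte count.
mb_to_hex() {
    printf '0x%08x\n' $(( $1 * 1024 * 1024 ))
}

# describe_crashkernel SIZE@OFFSET
# Hypothetical helper: split a value such as 128M@16M and show both parts
# in bytes. Only the "M" suffix is handled in this sketch.
describe_crashkernel() {
    value=$1
    size_mb=${value%%M@*}       # text before "M@", e.g. 128
    offset_mb=${value##*@}      # text after "@", e.g. 16M
    offset_mb=${offset_mb%M}    # drop the trailing "M", e.g. 16
    echo "size=$(mb_to_hex "$size_mb") offset=$(mb_to_hex "$offset_mb")"
}

describe_crashkernel 128M@16M
# prints: size=0x08000000 offset=0x01000000
```

The offset matches the physical address 0x01000000 quoted for 16MB.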
4. Specify where the vmcore should be created
Different types of dump target locations can be specified in the /etc/kdump.conf file. You can specify a directory of your choice in this file. For example:
path /usr/local/cores
The following sample entry uses NFS as the dump target. It mounts the exported filesystem and copies the vmcore file to the NFS server:
net my.server.com:/export/tmp
For more options, see the /etc/kdump.conf file.
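When checking which dump target is actually in effect, it can help to list only the active entries of the file. A minimal sketch, assuming comment lines start with "#"; active_kdump_directives is a hypothetical helper name:

```shell
# active_kdump_directives FILE
# Hypothetical helper: print every line that is neither blank nor a comment.
active_kdump_directives() {
    grep -v -e '^[[:space:]]*#' -e '^[[:space:]]*$' "$1"
}
```

On a configured system it would be called as: active_kdump_directives /etc/kdump.conf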
5. Update kdump configuration file – /etc/sysconfig/kdump (optional)
This file defines the dump-capture kernel specification, including its name/location, and command line for the kernel if it is to be different from the currently running kernel.
# cat /etc/sysconfig/kdump
KDUMP_KERNELVER=""
KDUMP_COMMANDLINE=""
KDUMP_COMMANDLINE_APPEND="irqpoll maxcpus=1"
KEXEC_ARGS=" --args-linux"
KDUMP_BOOTDIR="/boot"
KDUMP_IMG="vmlinuz"
KDUMP_COMMANDLINE modifies the default crash kernel command line, which is otherwise taken from /proc/cmdline.
KDUMP_COMMANDLINE_APPEND adds irqpoll and maxcpus=1 to the command line for the crash kernel.
KEXEC_ARGS adds --args-linux to the kexec command line.
KDUMP_BOOTDIR is set to /boot.
KDUMP_IMG specifies the crash kernel image name, defaulting to /boot/vmlinuz with the current kernel version appended.
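How these variables combine into the crash kernel command line can be illustrated with a small sketch. This is not the actual kdump init script, which may process the command line further; build_crash_cmdline is a hypothetical function, and the only assumptions are the rules described above: an empty KDUMP_COMMANDLINE falls back to the running kernel's command line, and KDUMP_COMMANDLINE_APPEND is added at the end.

```shell
# build_crash_cmdline BASE OVERRIDE APPEND
# Hypothetical illustration only.
#   BASE     - running kernel's command line (contents of /proc/cmdline)
#   OVERRIDE - value of KDUMP_COMMANDLINE (may be empty)
#   APPEND   - value of KDUMP_COMMANDLINE_APPEND (may be empty)
build_crash_cmdline() {
    base=$1
    override=$2
    append=$3
    if [ -n "$override" ]; then
        base=$override          # an explicit command line replaces the default
    fi
    if [ -n "$append" ]; then
        echo "$base $append"
    else
        echo "$base"
    fi
}

build_crash_cmdline "ro root=/dev/VolGroup00/LogVol00 rhgb quiet" "" "irqpoll maxcpus=1"
# prints: ro root=/dev/VolGroup00/LogVol00 rhgb quiet irqpoll maxcpus=1
```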
6. Enable the kdump service
Configure the kdump service to start when the system boots.
# chkconfig kdump on
Note: the service cannot be started yet, because the new crashkernel kernel parameter does not take effect until the system is rebooted.
7. Reboot the system for kdump configuration to take effect
Verify that kdump is active:
# cat /proc/cmdline
ro root=/dev/VolGroup00/LogVol00 rhgb quiet crashkernel=128M@16M
# /etc/init.d/kdump status
Kdump is operational
# /sbin/chkconfig --list |grep kdump
kdump 0:off 1:off 2:on 3:on 4:on 5:on 6:off
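The first of these checks can also be automated, for example in a post-reboot script. A minimal sketch, in which crashkernel_value is a hypothetical helper that takes the command line as a string:

```shell
# crashkernel_value CMDLINE
# Hypothetical helper: print the value of the crashkernel= parameter found in
# the given kernel command line, or return non-zero if it is absent.
crashkernel_value() {
    for word in $1; do          # unquoted on purpose: split on whitespace
        case $word in
            crashkernel=*)
                echo "${word#crashkernel=}"
                return 0
                ;;
        esac
    done
    return 1
}

crashkernel_value "ro root=/dev/VolGroup00/LogVol00 rhgb quiet crashkernel=128M@16M"
# prints: 128M@16M
```

On a live system the argument would be the contents of /proc/cmdline, for example: crashkernel_value "`cat /proc/cmdline`"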
8. Test kdump by crashing the system
# echo c > /proc/sysrq-trigger
This causes the kernel to panic, after which the system restarts into the kdump kernel. When the boot process reaches the point where it starts the kdump service, the vmcore file should be copied to the location specified in the /etc/kdump.conf file.
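After the system comes back up, the dump can be located with a short search. A minimal sketch, assuming a local dump directory and that the dump file is named vmcore; find_vmcores is a hypothetical helper, and the subdirectory layout under the dump path is not assumed:

```shell
# find_vmcores DIR
# Hypothetical helper: list every regular file named "vmcore" under DIR.
find_vmcores() {
    find "$1" -type f -name vmcore
}
```

With the path entry shown in step 4 it would be called as: find_vmcores /usr/local/cores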