Kernel Linux

Linux Kernel Watchdog Explained

Linux Kernel Watchdog Explained

Linux Kernel Watchdog

The Linux kernel watchdog is used to monitor if a system is running. It is supposed to automatically reboot hanged systems due to unrecoverable software errors. The watchdog module is specific to the hardware or chip being used. Personal computer users don't need watchdog as they can reset the system manually. However, it is useful for systems that are mission critical and need the ability to reboot themselves without human intervention. For example, servers on a remote location or embedded equipment on a spacecraft that need automatic hardware reset capabilities.

Warning: Proceed with Caution

Wrong configurations of a watchdog on your system can cause problems like:

So avoid using live servers to test Linux kernel watchdog.

Watchdog Module

Watchdog functionality on the hardware side sets up a timer that times out after a predetermined period. The watchdog software then periodically refreshes the hardware timer. If the software stops refreshing, then after the predetermined period, the timer performs a hardware reset of the device. In order for a watchdog timer to be functional, the motherboard manufacturer has to use the chip's watchdog functionality. Often the documentation from the manufacturer is not clear about whether the functionality was implemented. In that case, you have to test it out.

Also, you need the right watchdog kernel module to be loaded in your Linux system. Different chips use different modules. For example:

After the module is loaded, you can check /dev/watchdog on the Linux system. If this file is present, that means the watchdog kernel device driver or module was loaded. The system periodically keeps writing to /dev/watchdog. It is also called “kicking or feeding the watchdog”. If the system fails to kick or feed the watchdog, then after a while the system is hard reset.

Watchdog Daemon

The watchdog daemon opens the device and provides the necessary refresh to keep the system from resetting. It can test process table space, memory usage, file accessibility, work overload, file table overflow, IP address ping, network interface traffic, temperature, running processes and more. If the tests fail, then watchdog causes a shutdown.

Starting and Stopping Watchdog

Watchdog daemon should start at boot time and put itself in the background. You can check if it is running:

ps -af | grep watch*

If the kernel is NOT compiled with CONFIG_WATCHDOG_NOWAYOUT, then if you close the /dev/watchdog properly, it will not cause a reboot. You can write the character V into /dev/watchdog and then close the file. This should stop the watchdog.

Testing the Watchdog

If you want to test if the hardware watchdog is working, you can do the following from your administrator command prompt:

cat >> /dev/watchdog

And press “enter” twice and wait. The prompt will not come back. After awhile depending on your kernel's setting, the system should perform the hard reboot.

References:

Battle For Wesnoth 1.13.6 Development Released
Battle For Wesnoth 1.13.6 released last month, is the sixth development release in the 1.13.x series and it delivers a number of improvements, most no...
Cum se instalează League Of Legends pe Ubuntu 14.04
Dacă ești fan al League of Legends, atunci aceasta este o oportunitate pentru tine de a testa rula League of Legends. Rețineți că LOL este acceptat pe...
Instalați cel mai recent joc de strategie OpenRA pe Ubuntu Linux
OpenRA este un motor de jocuri de strategie în timp real Libre / Free care recreează primele jocuri Westwood, cum ar fi clasicul Command & Conquer: Re...