Hardware testing

From OpenVZ Virtuozzo Containers Wiki
Revision as of 13:48, 25 May 2006 by Kir (talk | contribs) (Filled in Memtester)

Sometimes a kernel panic, oops, or other fatal crash is not the programmers' fault but the result of faulty hardware. This article describes how to test your hardware to make sure it is in good shape.

RAM tests

Random Access Memory (RAM) is sometimes faulty, which can lead to very strange system crashes. It is therefore highly recommended to test your system RAM. Several approaches and tools can be used.

Memtest86 and Memtest86+

Memtest86 is a stand-alone RAM tester. It can either be booted from a CD, or from your normal Linux bootloader, such as GRUB or LILO.

Memtest86+ is a forked version of Memtest86 with some features added.

You can either download and install one of these programs from its project site, or it may already be part of your Linux distribution.

For Fedora Core, memtest86+ is available:

yum install memtest86+

For Gentoo, both programs are available:

emerge memtest86
emerge memtest86+

To test your system for faulty RAM, install either program and reboot into it. Let it run for at least a few hours (at least 2-3 full passes). If even a single error is reported, you have to replace your RAM modules (or, if your system is overclocked, return it to its normal speed).

Memtester

Memtester is a userspace utility for testing the memory subsystem for faults. It is included in some distributions.

For Fedora Core:

yum install memtester

For Gentoo:

emerge memtester

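The good thing is that you can test your memory without rebooting the server, and you can run other programs alongside it. The bad thing is that not all of the memory is tested, because memory in use by the kernel and by other processes cannot be handed to the test.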

Invoke memtester as root, giving the amount of memory to test as an argument, e.g.:

# /usr/sbin/memtester 512M

The more memory you specify, the better.
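One way to choose the amount is to derive it from the memory that is currently free. The following is only a minimal sketch and not part of memtester: the helper name pick_test_size_mb and the default 128 MB reserve are assumptions made for illustration, and the calculation relies on the MemFree line of /proc/meminfo.

```shell
# Hypothetical helper (not part of memtester): print how many megabytes
# to hand to memtester, based on currently free memory minus a reserve.
#   $1 = meminfo file to read (default: /proc/meminfo)
#   $2 = megabytes to leave untouched (default: 128, an arbitrary assumption)
pick_test_size_mb() {
    meminfo=${1:-/proc/meminfo}
    reserve_mb=${2:-128}

    # MemFree is reported in kB; take the first matching line only.
    free_kb=$(awk '/^MemFree:/ {print $2; exit}' "$meminfo")
    [ -n "$free_kb" ] || return 1

    size_mb=$(( free_kb / 1024 - reserve_mb ))

    # Fail if nothing is left to test after subtracting the reserve.
    [ "$size_mb" -gt 0 ] || return 1
    echo "$size_mb"
}

# Example invocation (as root), left commented out on purpose:
# size=$(pick_test_size_mb) && /usr/sbin/memtester "${size}M" 3
```

The optional second argument to memtester is the number of test loops; without it, memtester keeps running until it is interrupted. The reserve is there so that the server's other programs still have memory to work with while the test runs.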

CPU cooling tests

FIXME cpuburn