Identifying Bad Memory

I was recently having problems with a dedicated production server that runs my MySQL server and a number of websites. It is most annoying when your system crashes without reporting anything in /var/log/messages.

The tool of choice from the hosting provider, SoftLayer, was PassMark BurnInTest Linux, which is installed with every dedicated server.

This is a commercial product, so I will need to investigate open source alternatives, but for diagnosing this particular pain the included tool was well worth the investment.

**************
RESULT SUMMARY
**************
Test Start time: Sun Feb 22 16:02:48 2009
Test Stop time: Sun Feb 22 16:07:49 2009
Test Duration: 000h 05m 01s

Test Name            Cycles   Operations      Result   Errors   Last Error
CPU - Maths          261      488 Billion     PASS     0        No errors
Memory (RAM)         2        3.081 Billion   FAIL     1        Error verifying data in RAM
Network: 127.0.0.1   412995   4.295 Billion   PASS     0        No errors
TEST RUN FAILED

*********************
SERIOUS ERROR SUMMARY
*********************
SERIOUS : 2009-02-22 16:07:31, RAM, SERIOUS: Error verifying data in RAM (x 1)

It was great to get a simple answer to the problem: bad memory.
Once the memory was replaced during a scheduled maintenance, I was operational again.

**************
RESULT SUMMARY
**************
Test Start time: Sun Feb 22 20:34:37 2009
Test Stop time: Sun Feb 22 20:39:38 2009
Test Duration: 000h 05m 01s

Test Name            Cycles   Operations      Result   Errors   Last Error
CPU - Maths          267      406 Billion     PASS     0        No errors
Memory (RAM)         1        3.664 Billion   PASS     0        No errors
Network: 127.0.0.1   334578   3.480 Billion   PASS     0        No errors
TEST RUN PASSED

*********************
SERIOUS ERROR SUMMARY
*********************
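
If you run a test like this regularly, it may be handy to check the report from a script rather than by eye. The following is only a rough sketch in Python: it assumes the report is saved as plain text in the layout shown above, and the function name failed_tests and the regular expression are my own guesses based on these two reports, not anything documented by PassMark.

```python
import re

# Assumed row layout, inferred only from the two reports in this post:
#   <test name> <cycles> <operations> <PASS|FAIL> <errors> <last error>
ROW = re.compile(
    r"^(?P<name>.+?)\s+(?P<cycles>\d+)\s+(?P<ops>[\d.]+\s+\w+)\s+"
    r"(?P<result>PASS|FAIL)\s+(?P<errors>\d+)\s+(?P<last_error>.*)$"
)

# Rows copied from the first (failing) run in this post.
SAMPLE_REPORT = """\
Test Name Cycles Operations Result Errors Last Error
CPU - Maths 261 488 Billion PASS 0 No errors
Memory (RAM) 2 3.081 Billion FAIL 1 Error verifying data in RAM
Network: 127.0.0.1 412995 4.295 Billion PASS 0 No errors
TEST RUN FAILED
"""


def failed_tests(report):
    """Return (test name, error count, last error) for every FAIL row."""
    failures = []
    for line in report.splitlines():
        match = ROW.match(line.strip())
        # Header, banner and blank lines do not match the row pattern.
        if match and match.group("result") == "FAIL":
            failures.append(
                (
                    match.group("name"),
                    int(match.group("errors")),
                    match.group("last_error"),
                )
            )
    return failures


if __name__ == "__main__":
    for name, errors, last_error in failed_tests(SAMPLE_REPORT):
        print(f"{name}: {errors} error(s), last: {last_error}")
```

Because the pattern splits on any run of whitespace, it should not matter whether the columns are separated by one space or padded for alignment. A different BurnInTest version may well lay out its report differently, so treat the pattern as something to adjust.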

Comments

  1. David Freeman says

    When RAM is already suspect and if one has console access, I think that installing and booting to memtest86+ or its equivalent seems soundest and most versatile. Both BIOS and OS utilities’ memory tests can be somewhat superficial by comparison with those of memtest86+.

    For the RHEL OS family there is a package containing source for memtester (CLI), which is executed against the running kernel. Still, if its tests provoke a crash, one is sort of sent back to the well to refill the leaky bucket, as it were.

    Another issue is the expectation of full desktop style GUI support in the OS environment of the (production?) machine to be tested. Doesn’t the passmark utility require X libraries?