Examine WHEA_UNCORRECTABLE_ERROR (124) BSOD with WinDbg

One of my computers recently crashed with a BSOD.

Crashes are very rare on this machine, so I decided to identify the cause.

Troubleshooting

I checked the system event log for a bugcheck and found this:

Log Name:      System
Source:        Microsoft-Windows-WER-SystemErrorReporting
...
Description:
The computer has rebooted from a bugcheck.  The bugcheck was: 0x00000124 (0x0000000000000000, 0xfffffa800d91b038, 0x00000000b2004000, 0x0000000029000175). A dump was saved in: C:\Windows\MEMORY.DMP. Report Id: 040116-21216-01.
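
The description packs the bugcheck code and its four parameters into a single string. As a minimal sketch (the helper name parse_bugcheck is my own and not part of any Windows tooling), they can be separated with a regular expression:

```python
import re

# Description text copied from the event log entry above.
DESCRIPTION = (
    "The computer has rebooted from a bugcheck.  The bugcheck was: "
    "0x00000124 (0x0000000000000000, 0xfffffa800d91b038, "
    "0x00000000b2004000, 0x0000000029000175). "
    "A dump was saved in: C:\\Windows\\MEMORY.DMP. Report Id: 040116-21216-01."
)


def parse_bugcheck(description):
    """Return (code, [param1..param4]) as integers, or None if no match.

    Hypothetical helper for illustration only.
    """
    match = re.search(
        r"bugcheck was:\s*(0x[0-9a-fA-F]+)\s*\(([^)]*)\)", description
    )
    if match is None:
        return None
    code = int(match.group(1), 16)
    params = [int(p.strip(), 16) for p in match.group(2).split(",")]
    return code, params


code, params = parse_bugcheck(DESCRIPTION)
print(hex(code))                 # 0x124
print([hex(p) for p in params])
```

The second parameter is the address that WinDbg later reports as the WHEA_ERROR_RECORD structure.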

 

I opened the crash dump, C:\Windows\MEMORY.DMP, in WinDbg (the x64 version in this case).

WinDbg’s !analyze command usually reveals relevant information about a BSOD, so that’s what I checked first:

Run: !analyze -v

0: kd> !analyze -v

...
WHEA_UNCORRECTABLE_ERROR (124)
A fatal hardware error has occurred. Parameter 1 identifies the type of error
source that reported the error. Parameter 2 holds the address of the
WHEA_ERROR_RECORD structure that describes the error conditon.
Arguments:
Arg1: 0000000000000000, Machine Check Exception
Arg2: fffffa800d91b038, Address of the WHEA_ERROR_RECORD structure.
Arg3: 00000000b2004000, High order 32-bits of the MCi_STATUS value.
Arg4: 0000000029000175, Low order 32-bits of the MCi_STATUS value.

...

PRIMARY_PROBLEM_CLASS:  X64_0x124_AuthenticAMD_PROCESSOR_CACHE

 

It seemed to be a hardware error related to the processor cache.
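
Arg3 and Arg4 are the two 32-bit halves of one 64-bit MCi_STATUS value. A small sketch of how they fit together (the function and variable names are my own):

```python
# Bugcheck parameters 3 and 4 as reported by !analyze -v.
ARG3_HIGH = 0x00000000B2004000  # high-order 32 bits of MCi_STATUS
ARG4_LOW = 0x0000000029000175   # low-order 32 bits of MCi_STATUS


def combine_mci_status(high, low):
    """Join the two 32-bit halves into the 64-bit status value."""
    return ((high & 0xFFFFFFFF) << 32) | (low & 0xFFFFFFFF)


status = combine_mci_status(ARG3_HIGH, ARG4_LOW)
print(f"0x{status:016x}")  # 0xb200400029000175
```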

For more details I looked at the WHEA_ERROR_RECORD information:

(Only section 2, which contains the actual error, is shown.)

0: kd> !errrec fffffa800d91b038
===============================================================================
Common Platform Error Record @ fffffa800d91b038
-------------------------------------------------------------------------------
Record Id     : 01d17af4b6b560a4
Severity      : Fatal (1)
Length        : 928
Creator       : Microsoft
Notify Type   : Machine Check Exception
Timestamp     : 4/1/2016 18:19:44 (UTC)
Flags         : 0x00000000

...

===============================================================================
Section 2     : x86/x64 MCA
-------------------------------------------------------------------------------
Descriptor    @ fffffa800d91b148
Section       @ fffffa800d91b2d0
Offset        : 664
Length        : 264
Flags         : 0x00000000
Severity      : Fatal

Error         : DCACHEL1_EVICT_ERR (Proc 0 Bank 0)
Status      : 0xb200400029000175

 

Apparently a hardware error related to the level 1 data cache caused the system crash.
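
The status value can also be decoded by hand. The sketch below assumes the architectural x86 machine-check layout of MCi_STATUS (flag bits 63 to 57, MCA error code in bits 15:0) and the memory-hierarchy error code form 0000 0001 RRRR TTLL. The lookup tables are abbreviated and all names are my own, so check the processor vendor's documentation before relying on it:

```python
STATUS = 0xB200400029000175  # Status value from the !errrec output

# Assumed architectural flag positions in MCi_STATUS.
FLAG_BITS = {"VAL": 63, "OVER": 62, "UC": 61, "EN": 60,
             "MISCV": 59, "ADDRV": 58, "PCC": 57}

# Assumed field encodings for memory-hierarchy error codes.
REQUESTS = {0: "ERR", 1: "RD", 2: "WR", 3: "DRD", 4: "DWR",
            5: "IRD", 6: "PREFETCH", 7: "EVICT", 8: "SNOOP"}
TYPES = {0: "I", 1: "D", 2: "G"}                 # instruction, data, generic
LEVELS = {0: "L0", 1: "L1", 2: "L2", 3: "LG"}    # cache level


def decode_status(status):
    """Return the set flags and, for cache errors, the decoded fields."""
    flags = [name for name, bit in FLAG_BITS.items() if (status >> bit) & 1]
    code = status & 0xFFFF
    result = {"flags": flags, "mca_error_code": code}
    if (code & 0xFF00) == 0x0100:  # memory-hierarchy (cache) error form
        result["request"] = REQUESTS.get((code >> 4) & 0xF, "?")
        result["type"] = TYPES.get((code >> 2) & 0x3, "?")
        result["level"] = LEVELS.get(code & 0x3, "?")
    return result


decoded = decode_status(STATUS)
print(decoded["flags"])  # ['VAL', 'UC', 'EN', 'PCC']
print(decoded["type"], decoded["level"], decoded["request"])  # D L1 EVICT
```

Under these assumptions the result agrees with the DCACHEL1_EVICT_ERR label: a data cache, level 1, eviction error that was uncorrected (UC) and left the processor context corrupt (PCC).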

 

The computer in question has an AMD Athlon II X2 280 CPU.

Using CPU-Z I noticed that the core voltage seemed a little low for this CPU.

I remembered that I had undervolted the CPU to save power.

(The system had run without reliability problems for years until now.)

I checked the BIOS settings and found that the CPU was undervolted by 0.15 V (an offset of -0.15 V).

I decided to reduce the undervolt to 0.1 V (an offset of -0.1 V).

If any further reliability problems occur, I will restore the standard voltage.

Conclusion

If hardware is running outside its specifications and system crashes occur, adjust the settings closer to the specifications.

(Examples of running out of spec: undervolting, overvolting, and overclocking.)