Examining PFN_LIST_CORRUPT (4e) and PAGE_FAULT_IN_NONPAGED_AREA (50) BSOD

I recently experienced stability problems on a new work computer, which crashed with a BSOD.

 

I looked for clues in Event Viewer and found:

Log Name:      System
Source:        Microsoft-Windows-WER-SystemErrorReporting
Event ID:      1001
Task Category: None
Level:         Error
Keywords:      Classic
Description:
The computer has rebooted from a bugcheck.  The bugcheck was: 0x0000004e (0x0000000000000099, 0x00000000003def55, 0x0000000000000000, 0x0000000000000001). A dump was saved in: C:\Windows\MEMORY.DMP.
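
The bugcheck code and its four parameters can be pulled out of that description text with a small script, which is handy when comparing several crashes. This is a minimal Python sketch, assuming the message always has the shape shown above. The function name `parse_bugcheck` is my own and not part of any Windows tool.

```python
import re

# Pattern for the description text of Event ID 1001.
# Assumption: the message always follows the layout seen in this post.
BUGCHECK_RE = re.compile(
    r"The bugcheck was: (0x[0-9a-fA-F]+) "
    r"\((0x[0-9a-fA-F]+), (0x[0-9a-fA-F]+), (0x[0-9a-fA-F]+), (0x[0-9a-fA-F]+)\)\. "
    r"A dump was saved in: (.+?)\.?$"
)


def parse_bugcheck(description):
    """Return (code, [arg1..arg4], dump_path), or None if the text does not match."""
    match = BUGCHECK_RE.search(description.strip())
    if match is None:
        return None
    code = int(match.group(1), 16)
    args = [int(match.group(i), 16) for i in range(2, 6)]
    return code, args, match.group(6)


message = (
    "The computer has rebooted from a bugcheck.  The bugcheck was: 0x0000004e "
    "(0x0000000000000099, 0x00000000003def55, 0x0000000000000000, "
    "0x0000000000000001). A dump was saved in: C:\\Windows\\MEMORY.DMP."
)
code, args, dump_path = parse_bugcheck(message)
print(hex(code), [hex(a) for a in args], dump_path)
```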

 

Examined the memory dump with WinDbg (x64).

Checked for details about the crash with:

!analyze -v

Part of the result:

PFN_LIST_CORRUPT (4e)
Typically caused by drivers passing bad memory descriptor lists (ie: calling
MmUnlockPages twice with the same list, etc).  If a kernel debugger is
available get the stack trace.
Arguments:
Arg1: 0000000000000099, A PTE or PFN is corrupt
Arg2: 00000000003def55, page frame number
Arg3: 0000000000000000, current page state
Arg4: 0000000000000001, 0

 

Examined the call stack with:

kp

Result:

Child-SP          RetAddr           Call Site
fffff880`030b34f8 fffff800`0311c37c nt!KeBugCheckEx
fffff880`030b3500 fffff800`03038c17 nt!MiBadShareCount+0x4c
fffff880`030b3540 fffff800`030bc057 nt! ?? ::FNODOBFM::`string'+0x2cf6d
fffff880`030b36f0 fffff800`030bda09 nt!MiDeleteVirtualAddresses+0x41f
fffff880`030b38b0 fffff800`033a9f21 nt!MiRemoveMappedView+0xd9
fffff880`030b39d0 fffff800`033aa323 nt!MiUnmapViewOfSection+0x1b1
fffff880`030b3a90 fffff800`03089693 nt!NtUnmapViewOfSection+0x5f
fffff880`030b3ae0 00000000`76febfda nt!KiSystemServiceCopyEnd+0x13
00000000`0a8df5d8 00000000`00000000 0x76febfda
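
One thing worth checking in a stack like this is whether any third-party driver appears in it. The rough Python sketch below lists the module names found in the Call Site column. It assumes the three-column `kp` layout shown above, and `modules_in_stack` is a hypothetical helper, not a WinDbg feature.

```python
def modules_in_stack(stack_text):
    """Return the distinct module names in a stack listing, in order of appearance."""
    modules = []
    for line in stack_text.splitlines():
        parts = line.split(None, 2)  # Child-SP, RetAddr, Call Site
        if len(parts) < 3 or "!" not in parts[2]:
            continue  # header line, blank line, or frame without a symbol
        module = parts[2].split("!", 1)[0]
        if module not in modules:
            modules.append(module)
    return modules


# Abridged copy of the stack shown above.
stack = """\
Child-SP          RetAddr           Call Site
fffff880`030b34f8 fffff800`0311c37c nt!KeBugCheckEx
fffff880`030b3500 fffff800`03038c17 nt!MiBadShareCount+0x4c
fffff880`030b3540 fffff800`030bc057 nt! ?? ::FNODOBFM::`string'+0x2cf6d
fffff880`030b36f0 fffff800`030bda09 nt!MiDeleteVirtualAddresses+0x41f
fffff880`030b3ae0 00000000`76febfda nt!KiSystemServiceCopyEnd+0x13
00000000`0a8df5d8 00000000`00000000 0x76febfda
"""
print(modules_in_stack(stack))  # → ['nt']
```

Only `nt` (the Windows kernel) shows up here, so this stack does not point at a specific third-party driver.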

 

The call stack contained only Windows kernel (nt) functions and no third-party driver. Memory corruption of this kind is often caused by failing memory modules, so I tested the memory with Memtest86+.

I only had time to run it briefly, but it completed one pass without errors.

However, the next day the computer crashed again with another BSOD…

 

I found this in Event Viewer:

Log Name:      System
Source:        Microsoft-Windows-WER-SystemErrorReporting
Event ID:      1001
Task Category: None
Level:         Error
Keywords:      Classic
Description:
The computer has rebooted from a bugcheck.  The bugcheck was: 0x00000050 (0xfffff8a0384b1280, 0x0000000000000000, 0xfffff800031fe133, 0x0000000000000000). A dump was saved in: C:\Windows\MEMORY.DMP.

 

Examined the new memory dump with:

!analyze -v

Part of the result:

PAGE_FAULT_IN_NONPAGED_AREA (50)
Invalid system memory was referenced.  This cannot be protected by try-except,
it must be protected by a Probe.  Typically the address is just plain bad or it
is pointing at freed memory.
Arguments:
Arg1: fffff8a0384b1280, memory referenced.
Arg2: 0000000000000000, value 0 = read operation, 1 = write operation.
Arg3: fffff800031fe133, If non-zero, the instruction address which referenced the bad memory
address.
Arg4: 0000000000000000, (reserved)
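
The arguments above can be decoded mechanically. The Python sketch below is only an illustration: the lookup table covers just the three bugcheck codes that appear in these posts, the read/write mapping follows the `!analyze -v` text above and nothing more, and `describe_page_fault` is a hypothetical helper name.

```python
# Names of the bugcheck codes discussed in these posts, as printed by !analyze -v.
BUGCHECK_NAMES = {
    0x1A: "MEMORY_MANAGEMENT",
    0x4E: "PFN_LIST_CORRUPT",
    0x50: "PAGE_FAULT_IN_NONPAGED_AREA",
}

# Meaning of Arg2 according to the analysis text (0 = read, 1 = write).
ACCESS_TYPES = {0: "read", 1: "write"}


def describe_page_fault(args):
    """Summarize the four arguments of bugcheck 0x50 in one line."""
    address, operation, instruction, _reserved = args
    access = ACCESS_TYPES.get(operation, "unknown access")
    text = f"{BUGCHECK_NAMES[0x50]}: {access} of 0x{address:016x}"
    if instruction != 0:
        text += f" by instruction at 0x{instruction:016x}"
    return text


summary = describe_page_fault([0xFFFFF8A0384B1280, 0x0, 0xFFFFF800031FE133, 0x0])
print(summary)
```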

 

Examined the call stack with:

kp

Result:

Child-SP          RetAddr           Call Site
fffff880`031735f8 fffff800`031442be nt!KeBugCheckEx
fffff880`03173600 fffff800`030c552e nt! ?? ::FNODOBFM::`string'+0x3bc5f
fffff880`03173760 fffff800`031fe133 nt!KiPageFault+0x16e
fffff880`031738f0 fffff800`030af3b1 nt!ExFreePoolWithTag+0x43
fffff880`031739a0 fffff880`018450c6 nt!FsRtlUninitializeBaseMcb+0x41
fffff880`031739d0 fffff800`030d0355 Ntfs!NtfsMcbCleanupLruQueue+0xf6
fffff880`03173b70 fffff800`03362236 nt!ExpWorkerThread+0x111
fffff880`03173c00 fffff800`030b8706 nt!PspSystemThreadStartup+0x5a
fffff880`03173c40 00000000`00000000 nt!KxStartSystemThread+0x16

 

A second BSOD related to memory access pointed strongly to a problem with the memory modules.

Ran Memtest86+ overnight for 15+ hours.

The next day Memtest86+ had found 160 memory errors…

 

I decided to reseat the memory modules.

Then ran Memtest86+ overnight again for almost 16 hours.

The next day no memory errors were found.

Hopefully the cause of, and the solution to, the BSOD crashes have been found. Time will tell.

Examining MEMORY_MANAGEMENT (1a) BSOD

A Lenovo Thinkpad T440p computer recently crashed with a BSOD.

I started looking for clues in Event Viewer and found:

Log Name:      System
Source:        Microsoft-Windows-WER-SystemErrorReporting
Event ID:      1001
Task Category: None
Level:         Error
Keywords:      Classic
Description:
The computer has rebooted from a bugcheck.  The bugcheck was: 0x0000001a (0x0000000000041792, 0xffff808000082f70, 0x0004000000000000, 0x0000000000000000). A dump was saved in: C:\WINDOWS\MEMORY.DMP.

 

Decided to examine the memory dump, so started WinDbg (x64) and opened:

C:\Windows\Memory.dmp

This message was displayed:

BugCheck 1A, {41792, ffff808000082f70, 4000000000000, 0}

Probably caused by : memory_corruption

 

Checked for more details with:

!analyze -v

Part of the result:

*************************************************************
*                                                           *
*                    Bugcheck Analysis                      *
*                                                           *
*************************************************************

MEMORY_MANAGEMENT (1a)
# Any other values for parameter 1 must be individually examined.
Arguments:
Arg1: 0000000000041792, A corrupt PTE has been detected. Parameter 2 contains the address of
the PTE. Parameters 3/4 contain the low/high parts of the PTE.
Arg2: ffff808000082f70
Arg3: 0004000000000000
Arg4: 0000000000000000
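
One heuristic that may be worth applying here, offered as my own assumption rather than anything WinDbg reports: a PTE value with exactly one bit set is sometimes read as a hint of a flipped bit in RAM. The Python sketch below checks Arg3 from the output above. `set_bits` is a hypothetical helper.

```python
def set_bits(value):
    """Return the positions (0 = least significant) of all bits set in value."""
    return [bit for bit in range(value.bit_length()) if (value >> bit) & 1]


pte_value = 0x0004000000000000  # Arg3 from the !analyze -v output
bits = set_bits(pte_value)
print(bits)  # → [50]
print("single bit set" if len(bits) == 1 else f"{len(bits)} bits set")
```

A single set bit proves nothing on its own, but it fits with testing the memory next.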

 

This issue indicated hardware failure, most likely defective memory.

So I booted Memtest86+ from a USB drive.

Within a few minutes it found multiple errors.

 

Tried cleaning the contacts on the memory modules, but it had no effect.

 

Then I tested each memory module separately in both sockets.

In every case the memory test found errors.

 

Decided to test another 8 GB memory module.

Ran Memtest86+ all night and it found no errors on the replacement memory module.

Conclusion

A computer that crashes with a MEMORY_MANAGEMENT (1a) BSOD likely has defective memory.

Test the memory with Memtest86+ or another test program.

Then replace any identified defective memory modules.

Reseat and clean contacts on memory modules

Computer memory can fail, and when it does it usually causes reliability problems unless ECC memory is used.

My preferred memory testing tool is: Memtest86+

 

Normal procedure is to test memory modules individually in different memory sockets, to identify the failing memory module or socket.

This entails reseating the memory modules. (Before handling memory modules, please take anti-static precautions.)

Modern memory buses use high-frequency, low-voltage signals. They need a good electrical connection to work reliably.

Sometimes the process of reseating the memory modules can solve the problem, if it was caused by an electrical connection issue.
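
The isolation procedure described above (one module at a time, in each socket) can be sketched in Python as follows. The module and socket names and the pass/fail results are made-up example data, and `isolate_fault` is a hypothetical helper, not part of Memtest86+.

```python
def isolate_fault(results):
    """results maps (module, socket) -> True if the memory test passed.

    Returns (bad_modules, bad_sockets): modules that failed in every socket
    they were tested in, and sockets that failed with every module tested in them.
    """
    modules = sorted({module for module, _ in results})
    sockets = sorted({socket for _, socket in results})
    bad_modules = [
        m for m in modules
        if not any(ok for (module, _), ok in results.items() if module == m)
    ]
    bad_sockets = [
        s for s in sockets
        if not any(ok for (_, socket), ok in results.items() if socket == s)
    ]
    return bad_modules, bad_sockets


# Hypothetical test log: module B fails wherever it is installed.
results = {
    ("A", "socket1"): True,
    ("A", "socket2"): True,
    ("B", "socket1"): False,
    ("B", "socket2"): False,
}
print(isolate_fault(results))  # → (['B'], [])
```

If every combination fails, both lists come back full and the result is ambiguous. In that case a known-good module is needed to tell failing modules from failing sockets.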

 

If memory tests continue to fail and the memory module is out of warranty, you can try cleaning dust, dirt or corrosion off the module's contacts before replacing it.

I suggest using a piece of cloth moistened with rubbing alcohol.

After cleaning, test the memory module again. If the memory tests run without errors for 24 hours, the problem is likely fixed.