In 4½ months I have experienced 16 BSOD system crashes on a new work computer:
Crash Date | Bug Check String | Bug Check Code | Caused By Address |
---|---|---|---|
21-06-2017 | DRIVER_POWER_STATE_FAILURE | 0x0000009f | ntoskrnl.exe+70e40 |
12-06-2017 | NTFS_FILE_SYSTEM | 0x00000024 | Ntfs.sys+4211 |
23-05-2017 | IRQL_NOT_LESS_OR_EQUAL | 0x0000000a | ntoskrnl.exe+6f4c0 |
10-05-2017 | IRQL_NOT_LESS_OR_EQUAL | 0x0000000a | ntoskrnl.exe+6f440 |
01-05-2017 | BAD_POOL_HEADER | 0x00000019 | win32k.sys+f13b2 |
24-03-2017 | BAD_POOL_CALLER | 0x000000c2 | ntoskrnl.exe+6f440 |
17-03-2017 | SYSTEM_SERVICE_EXCEPTION | 0x0000003b | afd.sys+41448 |
14-03-2017 | MEMORY_MANAGEMENT | 0x0000001a | ntoskrnl.exe+70400 |
13-03-2017 | PAGE_FAULT_IN_NONPAGED_AREA | 0x00000050 | VBoxDrv.sys+1f037 |
10-03-2017 | PFN_LIST_CORRUPT | 0x0000004e | ntoskrnl.exe+70400 |
02-03-2017 | SYSTEM_SERVICE_EXCEPTION | 0x0000003b | ntoskrnl.exe+70400 |
22-02-2017 | BAD_POOL_CALLER | 0x000000c2 | TDI.SYS+10be |
17-02-2017 | BAD_POOL_HEADER | 0x00000019 | ntoskrnl.exe+70400 |
16-02-2017 | SYSTEM_THREAD_EXCEPTION_NOT_HANDLED | 0x1000007e | iusb3xhc.sys+7dfb0 |
08-02-2017 | PAGE_FAULT_IN_NONPAGED_AREA | 0x00000050 | ntoskrnl.exe+70400 |
07-02-2017 | PFN_LIST_CORRUPT | 0x0000004e | ntoskrnl.exe+70400 |
So far I have:
- Performed multiple memory tests.
- Checked SSD health.
- Checked system files.
- Examined multiple memory dumps with WinDbg.
- Installed all relevant firmware and driver updates.
- Scanned for malware.
However, none of this has fixed the problem or revealed its real cause.
I eventually decided to replace the original memory modules:
2 x 8 GB DDR4-2133 CL15, Kingston KVR21N15D8K2/16
With:
2 x 8 GB DDR4-2133 CL15, Crucial CT8G4DFS8213.C8FDR1
This seemed to help somewhat.
System crashes used to be a semiweekly event.
After replacing the memory modules it became a semimonthly event.
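The change in frequency can be checked against the table above. The following is a minimal Python sketch that counts crashes per calendar month. The dates are copied from the table; the names `CRASH_DATES` and `crashes_per_month` are my own and purely illustrative.

```python
from collections import Counter
from datetime import datetime

# Crash dates copied from the table above (DD-MM-YYYY).
CRASH_DATES = [
    "21-06-2017", "12-06-2017", "23-05-2017", "10-05-2017",
    "01-05-2017", "24-03-2017", "17-03-2017", "14-03-2017",
    "13-03-2017", "10-03-2017", "02-03-2017", "22-02-2017",
    "17-02-2017", "16-02-2017", "08-02-2017", "07-02-2017",
]


def crashes_per_month(dates):
    """Return a Counter mapping 'YYYY-MM' to the number of crashes."""
    parsed = (datetime.strptime(d, "%d-%m-%Y") for d in dates)
    return Counter(p.strftime("%Y-%m") for p in parsed)


if __name__ == "__main__":
    for month, count in sorted(crashes_per_month(CRASH_DATES).items()):
        print(month, count)
```

For the table above this gives 5 crashes in February, 6 in March, none in April, 3 in May and 2 in June, which matches the drop in frequency described here.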
The system has an Intel Skylake CPU (Core i7-6700).
It has recently been discovered that some Intel Skylake and Kaby Lake CPUs have a hardware bug related to hyper-threading.
The bug is described in: 6th Generation Intel® Processor Family – Specification Update
Quote:
“Under complex micro-architectural conditions, short loops of less than 64 instructions that use AH, BH, CH or DH registers as well as their corresponding wider register (e.g. RAX, EAX or AX for AH) may cause unpredictable system behavior. This can only happen when both logical processors on the same physical processor are active.”
Until system vendors include microcode fixes in firmware/UEFI updates, the only workaround is to disable hyper-threading.
The stability problems I have experienced could be caused by this CPU hardware bug.
So I have disabled hyper-threading in BIOS/UEFI setup and will await firmware updates. I hope that the system will finally be stable and reliable.
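After changing the setting, it is worth confirming that Windows really sees only one logical processor per core. Below is a rough Python sketch, assuming the `wmic` command-line tool is available on the system and that `wmic cpu get NumberOfCores,NumberOfLogicalProcessors /value` prints `Key=Value` lines. The function names and the sample text are hypothetical, not captured from my machine.

```python
def parse_key_value_lines(text):
    """Parse 'Key=Value' lines into a dict of ints; ignore anything else."""
    values = {}
    for line in text.splitlines():
        key, sep, value = line.strip().partition("=")
        if sep and value.strip().isdigit():
            values[key.strip()] = int(value.strip())
    return values


def hyperthreading_active(info):
    """True when there are more logical processors than physical cores."""
    return info["NumberOfLogicalProcessors"] > info["NumberOfCores"]


# Hypothetical sample of what the command might print for a 4-core CPU
# once hyper-threading is disabled (assumed format, not real output).
SAMPLE_OUTPUT = "NumberOfCores=4\nNumberOfLogicalProcessors=4\n"

if __name__ == "__main__":
    info = parse_key_value_lines(SAMPLE_OUTPUT)
    print("Hyper-threading active:", hyperthreading_active(info))
```

The Core i7-6700 has 4 cores and 8 threads, so with hyper-threading enabled the two numbers would be 4 and 8, and with it disabled both should be 4.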
Conclusion
If you have an Intel Skylake or Kaby Lake CPU, it is recommended to disable hyper-threading for now.