Recovering from Linux boot failure with fsck

A virtual Linux machine used for development was behaving unreliably.

Programs would fail to start and running programs would freeze.

I decided to restart the system, but it failed to start up again.

 

It simply displayed this message:

BusyBox v1.22.1 (Ubuntu 1:1.22.0-15ubuntu1) built-in shell (ash)
Enter 'help' for a list of built-in commands.

(initramfs)

 

I tried to exit the shell, which led to this message:

/dev/sda1 contains a file system with errors, check forced.
/dev/sda1:
Inodes that were part of a corrupted orphan linked list found.

/dev/sda1: UNEXPECTED INCONSISTENCY; RUN fsck MANUALLY.
        (i.e., without -a or -p options)
fsck exited with status code 4
The root filesystem on /dev/sda1 requires a manual fsck

 

File system corruption seemed to explain the reliability problems and the subsequent boot failure.
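The status code in the message above is a bit mask. As a small sketch, assuming the bit meanings listed in the fsck(8) manual page, the value can be decoded like this (the function name is only illustrative):

```python
# Bit meanings as documented in the fsck(8) manual page.
FSCK_STATUS_BITS = {
    1: "file system errors corrected",
    2: "system should be rebooted",
    4: "file system errors left uncorrected",
    8: "operational error",
    16: "usage or syntax error",
    32: "checking canceled by user request",
    128: "shared-library error",
}


def describe_fsck_status(status):
    """Return the meaning of every bit set in an fsck exit status."""
    if status == 0:
        return ["no errors"]
    return [text for bit, text in FSCK_STATUS_BITS.items() if status & bit]


# Status reported by the failed boot.
print(describe_fsck_status(4))
```

Status 4 means that errors were found but left uncorrected, which matches the request to run fsck manually.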

Followed the suggestion of running fsck with:

fsck /dev/sda1

 

Accepted all file system repairs suggested by fsck.

(Should have used the -y option, which answers yes to every repair prompt.)

 

Then rebooted the system with:

reboot

 

After the file system repairs the system booted and seemed fully functional.

Conclusion

If a Linux system fails to boot and only displays a BusyBox / initramfs prompt, try exiting the shell.

This may provide information about the actual problem.

Examining BAD_POOL_CALLER (c2) BSOD

My work computer recently crashed again with another BSOD.

 

Checked Event Viewer and found:

Log Name:      System
Source:        Microsoft-Windows-WER-SystemErrorReporting
Event ID:      1001
Level:         Error
Keywords:      Classic
Description:
The computer has rebooted from a bugcheck.  The bugcheck was: 0x000000c2 (0x0000000000000007, 0x000000000000109b, 0x0000000000000000, 0xfffffa800cd9d010). A dump was saved in: C:\Windows\MEMORY.DMP.
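The bugcheck code and its four arguments can be pulled out of the event description with a short script. This is only a sketch: the regular expression assumes the description format shown above, and the function name is made up for illustration.

```python
import re

# Assumes the wording used in the Event ID 1001 description shown above.
BUGCHECK_RE = re.compile(
    r"The bugcheck was:\s*(0x[0-9a-fA-F]+)\s*\(([^)]*)\)"
)


def parse_bugcheck(description):
    """Extract the bugcheck code and its arguments as integers."""
    match = BUGCHECK_RE.search(description)
    if match is None:
        raise ValueError("no bugcheck information found")
    code = int(match.group(1), 16)
    args = [int(part.strip(), 16) for part in match.group(2).split(",")]
    return code, args


message = (
    "The computer has rebooted from a bugcheck.  The bugcheck was: "
    "0x000000c2 (0x0000000000000007, 0x000000000000109b, "
    "0x0000000000000000, 0xfffffa800cd9d010). "
    "A dump was saved in: C:\\Windows\\MEMORY.DMP."
)
code, args = parse_bugcheck(message)
print(f"bugcheck 0x{code:x}, arguments {[hex(a) for a in args]}")
```

The parsed values are the same ones that !analyze -v later lists as Arg1 to Arg4.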

 

Examined the memory dump with WinDbg (x64).

Checked for details about the crash with:

!analyze -v

Part of the result:

BAD_POOL_CALLER (c2)
The current thread is making a bad pool request.  Typically this is at a bad IRQL level or double freeing the same allocation, etc.
Arguments:
Arg1: 0000000000000007, Attempt to free pool which was already freed
Arg2: 000000000000109b, (reserved)
Arg3: 0000000000000000, Memory contents of the pool block
Arg4: fffffa800cd9d010, Address of the block of pool being deallocated

Debugging Details:
------------------

POOL_ADDRESS:  fffffa800cd9d010 Nonpaged pool

BUGCHECK_STR:  0xc2_7

DEFAULT_BUCKET_ID:  WIN7_DRIVER_FAULT

PROCESS_NAME:  vlc.exe

CURRENT_IRQL:  2

MODULE_NAME: avgtdia

IMAGE_NAME:  avgtdia.sys

 

Examined the call stack with:

kp

Result:

Child-SP          RetAddr           Call Site
fffff880`0db9b1f8 fffff800`031c3bf9 nt!KeBugCheckEx
fffff880`0db9b200 fffff880`01f729c5 nt!ExAllocatePoolWithTag+0x1951
fffff880`0db9b2b0 fffff880`04272775 avgtdia+0xb9c5
fffff880`0db9b330 fffff880`042407bb afd! ?? ::GFJBLGFE::`string'+0xd64c
fffff880`0db9b550 fffff800`033b028e afd!AfdFastIoDeviceControl+0x7ab
fffff880`0db9b8c0 fffff800`033b0896 nt!IopXxxControlFile+0x6be
fffff880`0db9ba00 fffff800`0308c693 nt!NtDeviceIoControlFile+0x56
fffff880`0db9ba70 00000000`73b12e09 nt!KiSystemServiceCopyEnd+0x13
00000000`045af0f8 00000000`00000000 0x73b12e09

 

The driver avgtdia.sys seemed to cause the crash.

 

Examined information about the avgtdia driver with:

lm v m avgtdia

Result:

start             end                 module name
fffff880`01f67000 fffff880`01fad000   avgtdia    (no symbols)
Loaded symbol image file: avgtdia.sys
Image path: \SystemRoot\system32\DRIVERS\avgtdia.sys
Image name: avgtdia.sys
Timestamp:        Wed Jul 27 15:24:36 2016 (5798B614)
CheckSum:         00053AED
ImageSize:        00046000
Translations:     0000.04b0 0000.04e4 0409.04b0 0409.04e4
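The module range from lm can be used to double check that a return address from the call stack really belongs to the driver. A minimal sketch (helper names are my own) using the addresses from the output above:

```python
def parse_windbg_address(text):
    """Convert a WinDbg 64-bit address such as fffff880`01f729c5 to an int."""
    return int(text.replace("`", ""), 16)


def offset_in_module(address, start, end):
    """Return the offset of address inside [start, end), or None if outside."""
    addr = parse_windbg_address(address)
    low = parse_windbg_address(start)
    high = parse_windbg_address(end)
    if low <= addr < high:
        return addr - low
    return None


# Return address from the call stack and module range from "lm v m avgtdia".
offset = offset_in_module(
    "fffff880`01f729c5", "fffff880`01f67000", "fffff880`01fad000"
)
print(hex(offset))
```

The result is 0xb9c5, which agrees with the avgtdia+0xb9c5 frame in the call stack.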

 

Discovered that avgtdia.sys was: AVG Network connection watcher

 

This made me suspect that other BSOD crashes were also caused by AVG Internet Security:

Examining PFN_LIST_CORRUPT (4e) and PAGE_FAULT_IN_NONPAGED_AREA (50) BSOD

 

I decided to uninstall AVG Internet Security using: AVG Remover

Installed replacement: Avira Antivirus

 

I used to experience 2 BSOD crashes per week on this computer.

After uninstalling AVG Internet Security, the computer has been running for 1 week without any crashes…

I hope that the root cause has been identified and that the computer will finally be stable and reliable.

Conclusion

Common causes for computer stability problems are failing hard disks, defective memory and buggy drivers.

It seems that some antivirus products can also cause stability problems, possibly combined with specific drivers or other system level software.

Android Studio build failing due to Gradle NullPointerException

I recently experienced problems with Android Studio, where it would no longer clean or rebuild a particular project.

 

The Gradle Console showed this message:

FAILURE: Build failed with an exception.

* What went wrong:
java.lang.NullPointerException (no error message)

* Try:
Run with --stacktrace option to get the stack trace. Run with --info or --debug option to get more log output.

 

Tried selecting: File menu -> Invalidate Caches / Restart…

Unfortunately this had no effect.

 

Looked for more details by building from the command line with the suggested options:

gradlew build --stacktrace --debug > build_debug_log.txt 2> build_debug_error.txt

 

(It was necessary to log standard output and error output separately, because otherwise they would get mixed up.)
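The same separation of the two streams can be scripted. The sketch below is an illustration, not what I actually ran: it uses a stand-in command so it works without Gradle, and the function name is hypothetical.

```python
import os
import subprocess
import sys
import tempfile


def run_with_separate_logs(command, stdout_path, stderr_path):
    """Run command and write stdout and stderr to two different files."""
    with open(stdout_path, "wb") as out, open(stderr_path, "wb") as err:
        completed = subprocess.run(command, stdout=out, stderr=err)
    return completed.returncode


# Stand-in command so the sketch runs without Gradle; a real run would
# pass the gradlew command line instead (an assumption, not tested here).
demo_command = [
    sys.executable,
    "-c",
    "import sys; print('to stdout'); print('to stderr', file=sys.stderr)",
]

with tempfile.TemporaryDirectory() as folder:
    log_path = os.path.join(folder, "build_debug_log.txt")
    error_path = os.path.join(folder, "build_debug_error.txt")
    exit_code = run_with_separate_logs(demo_command, log_path, error_path)
    with open(log_path) as log, open(error_path) as errors:
        print(exit_code, log.read().strip(), "|", errors.read().strip())
```

Keeping the streams in separate files preserves the order within each stream, at the cost of losing the ordering between them.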

However, the Gradle output still didn’t explain clearly why the build process was failing.

 

Found a suggestion to delete the .gradle folder in the project folder here:

http://stackoverflow.com/questions/39183674/java-lang-nullpointerexception-no-error-message

 

I closed the Android Studio project, moved the .gradle folder outside of the project folder and reopened the Android Studio project.

This solved the problem!

It was now possible to clean and rebuild the project again.

Conclusion

If Android Studio refuses to build due to NullPointerException from Gradle, try removing or moving the .gradle folder from the affected project.
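Moving the folder instead of deleting it keeps a way back. A small sketch of that step, assuming the project is closed in Android Studio first (function and folder names are hypothetical):

```python
import shutil
import tempfile
from pathlib import Path


def move_gradle_cache(project_dir, backup_dir):
    """Move project_dir/.gradle into backup_dir and return the new location.

    Returns None when the project has no .gradle folder.
    """
    source = Path(project_dir) / ".gradle"
    if not source.is_dir():
        return None
    target = Path(backup_dir) / ".gradle"
    if target.exists():
        raise FileExistsError(f"{target} already exists")
    Path(backup_dir).mkdir(parents=True, exist_ok=True)
    shutil.move(str(source), str(target))
    return target


# Demonstration on a throwaway folder structure.
with tempfile.TemporaryDirectory() as folder:
    project = Path(folder) / "MyProject"  # hypothetical project name
    (project / ".gradle").mkdir(parents=True)
    moved_to = move_gradle_cache(project, Path(folder) / "backup")
    print(moved_to.name, (project / ".gradle").exists())
```

The .gradle folder is recreated by Gradle on the next build, so the moved copy can be deleted once the project builds again.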

Examining SYSTEM_THREAD_EXCEPTION_NOT_HANDLED (7e) BSOD

My work computer recently crashed with a BSOD just after inserting a USB 3.0 memory stick.

Considering the circumstances I suspected that a USB driver bug caused the crash.

 

Checked Event Viewer and found:

Log Name:      System
Source:        Microsoft-Windows-WER-SystemErrorReporting
Event ID:      1001
Task Category: None
Level:         Error
Keywords:      Classic
Description:
The computer has rebooted from a bugcheck.  The bugcheck was: 0x0000007e (0xffffffffc0000005, 0xfffff88001e685fe, 0xfffff8800394e5a8, 0xfffff8800394de00). A dump was saved in: C:\Windows\MEMORY.DMP.

 

Examined the memory dump with WinDbg (x64).

Checked for details about the crash with:

!analyze -v

Part of the result:

SYSTEM_THREAD_EXCEPTION_NOT_HANDLED (7e)
This is a very common bugcheck.  Usually the exception address pinpoints
the driver/function that caused the problem.  Always note this address
as well as the link date of the driver/image that contains this address.
Arguments:
Arg1: ffffffffc0000005, The exception code that was not handled
Arg2: fffff88001e685fe, The address that the exception occurred at
Arg3: fffff8800394e5a8, Exception Record Address
Arg4: fffff8800394de00, Context Record Address

Debugging Details:
------------------

EXCEPTION_CODE: (NTSTATUS) 0xc0000005 - The instruction at 0x%08lx referenced memory at 0x%08lx. The memory could not be %s.

FAULTING_IP:
iusb3hub+235fe
fffff880`01e685fe 4c8b00          mov     r8,qword ptr [rax]

EXCEPTION_RECORD:  fffff8800394e5a8 -- (.exr 0xfffff8800394e5a8)
ExceptionAddress: fffff88001e685fe (iusb3hub+0x00000000000235fe)
ExceptionCode: c0000005 (Access violation)
ExceptionFlags: 00000000
NumberParameters: 2
Parameter[0]: 0000000000000000
Parameter[1]: 0000000000000000
Attempt to read from address 0000000000000000

MODULE_NAME: iusb3hub

IMAGE_NAME:  iusb3hub.sys

 

Examined the call stack with:

kp

Result:

Child-SP          RetAddr           Call Site
fffff880`0394d5d8 fffff800`0344cf24 nt!KeBugCheckEx
fffff880`0394d5e0 fffff800`0340a745 nt!PspUnhandledExceptionInSystemThread+0x24
fffff880`0394d620 fffff800`03101cb4 nt! ?? ::NNGAKEGL::`string'+0x21dc
fffff880`0394d650 fffff800`0310172d nt!_C_specific_handler+0x8c
fffff880`0394d6c0 fffff800`03100505 nt!RtlpExecuteHandlerForException+0xd
fffff880`0394d6f0 fffff800`03111a05 nt!RtlDispatchException+0x415
fffff880`0394ddd0 fffff800`030d5a82 nt!KiDispatchException+0x135
fffff880`0394e470 fffff800`030d45fa nt!KiExceptionDispatch+0xc2
fffff880`0394e650 fffff880`01e685fe nt!KiPageFault+0x23a
fffff880`0394e7e0 fffff880`01e4a2b6 iusb3hub+0x235fe
fffff880`0394e840 fffff880`01e4a055 iusb3hub+0x52b6
fffff880`0394e8b0 fffff880`01e4a7fd iusb3hub+0x5055
fffff880`0394e920 fffff880`01e5c9a7 iusb3hub+0x57fd
fffff880`0394e980 fffff880`01e5c3e4 iusb3hub+0x179a7
fffff880`0394ea90 fffff880`01e69b3b iusb3hub+0x173e4
fffff880`0394eb10 fffff800`033d2413 iusb3hub+0x24b3b
fffff880`0394eb40 fffff800`030df355 nt!IopProcessWorkItem+0x23
fffff880`0394eb70 fffff800`03371236 nt!ExpWorkerThread+0x111
fffff880`0394ec00 fffff800`030c7706 nt!PspSystemThreadStartup+0x5a
fffff880`0394ec40 00000000`00000000 nt!KxStartSystemThread+0x16

 

Apparently iusb3hub.sys caused an access violation by reading from address 0 (null pointer bug).
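For an access violation, the two exception parameters describe the type of access and the address involved. A short sketch of how they can be read, based on the documented meaning of the parameters for exception code 0xc0000005 (the helper name is my own):

```python
# First parameter of an access violation: 0 = read, 1 = write,
# 8 = data execution prevention (DEP) violation.
ACCESS_TYPES = {0: "read from", 1: "write to", 8: "execute at"}


def describe_access_violation(parameters):
    """Describe the two parameters of an access violation (0xc0000005)."""
    access, address = parameters
    action = ACCESS_TYPES.get(access, f"access (type {access})")
    note = " (null pointer)" if address == 0 else ""
    return f"Attempt to {action} address {address:016x}{note}"


# Parameter[0] and Parameter[1] from the exception record above.
print(describe_access_violation([0x0, 0x0]))
```

With both parameters zero this is a read from address 0, which fits the faulting instruction mov r8,qword ptr [rax] if rax was null.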

 

Examined information about the iusb3hub driver with:

lmv m iusb3hub

Result:

start             end                 module name
fffff880`01e45000 fffff880`01eaa000   iusb3hub   (no symbols)
Loaded symbol image file: iusb3hub.sys
Image path: \SystemRoot\system32\DRIVERS\iusb3hub.sys
Image name: iusb3hub.sys
Timestamp:        Fri Dec 18 16:59:07 2015 (56742D4B)
CheckSum:         0006D07A
ImageSize:        00065000
Translations:     0000.04b0 0000.04e4 0409.04b0 0409.04e4

 

Noticed that the driver was more than 1 year old.
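The hexadecimal value after the timestamp in the lm output is the link time in seconds since 1970. A sketch for decoding it and estimating the driver age follows; the reference date is an assumption chosen only for illustration.

```python
from datetime import datetime, timezone


def link_time(hex_timestamp):
    """Decode the hexadecimal link timestamp shown by lm into UTC."""
    return datetime.fromtimestamp(int(hex_timestamp, 16), tz=timezone.utc)


def age_in_days(hex_timestamp, reference):
    """Whole days between the driver's link time and a reference date."""
    return (reference - link_time(hex_timestamp)).days


built = link_time("56742D4B")
print(built.isoformat())

# Hypothetical reference date, not the actual date of the crash.
reference_date = datetime(2017, 1, 1, tzinfo=timezone.utc)
print(age_in_days("56742D4B", reference_date))
```

The decoded date is 18 December 2015 in UTC. The clock time differs by one hour from the WinDbg output, presumably because WinDbg shows it in the local time zone.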

Found details about “Intel(R) USB 3.0 Root Hub” in Device Manager.

 

Decided to search for an updated driver.

Installed and ran Intel Driver Update Utility, which found a newer USB 3.0 driver (5.0.0.32)

Installed the updated driver and rebooted the system.

Hoping that this will prevent the computer from crashing in the future.

USB driver problem preventing access to Samsung Android devices

I recently experienced problems connecting to Samsung Android devices with Android Studio from my work computer.

No Connected Devices were available.

 

I checked Device Manager and noticed a warning for: SAMSUNG Mobile USB Composite Device

 

Checked Properties and noticed the Device status:

Windows cannot load the device driver for this hardware. The driver may be corrupted or missing. (Code 39)

 

I checked driver details and noticed that the driver was unexpectedly: usbpcap.sys.

(The problem occurred after installing Wireshark and USBPcap…)

 

I decided to uninstall USBPcap. This didn’t solve the problem, but it changed the message for driver details to:

No driver files are required or have been loaded for this device.

 

Fixed the problem this way:

1. Clicked: Update Driver…

 

2. Clicked: Browse my computer for driver software

 

3. Clicked: Let me pick from a list of device drivers on my computer

 

4. Selected: SAMSUNG Mobile USB Composite Device Version: 2.12.4.0 [24-08-2016]

 

5. Clicked: Next

6. Noticed the message: Windows has successfully updated your driver software

 

7. Checked driver details, which now had the desired driver file:

C:\Windows\system32\DRIVERS\ssudbus.sys

 

This fixed the problem. It was again possible to connect to Samsung Android devices from Android Studio.

Examining PFN_LIST_CORRUPT (4e) and PAGE_FAULT_IN_NONPAGED_AREA (50) BSOD

I recently experienced stability problems on a new work computer, which crashed with a BSOD.

 

I looked for clues in Event Viewer and found:

Log Name:      System
Source:        Microsoft-Windows-WER-SystemErrorReporting
Event ID:      1001
Task Category: None
Level:         Error
Keywords:      Classic
Description:
The computer has rebooted from a bugcheck.  The bugcheck was: 0x0000004e (0x0000000000000099, 0x00000000003def55, 0x0000000000000000, 0x0000000000000001). A dump was saved in: C:\Windows\MEMORY.DMP.

 

Examined the memory dump with WinDbg (x64).

Checked for details about the crash with:

!analyze -v

Part of the result:

PFN_LIST_CORRUPT (4e)
Typically caused by drivers passing bad memory descriptor lists (ie: calling
MmUnlockPages twice with the same list, etc).  If a kernel debugger is
available get the stack trace.
Arguments:
Arg1: 0000000000000099, A PTE or PFN is corrupt
Arg2: 00000000003def55, page frame number
Arg3: 0000000000000000, current page state
Arg4: 0000000000000001, 0

 

Examined the call stack with:

kp

Result:

Child-SP          RetAddr           Call Site
fffff880`030b34f8 fffff800`0311c37c nt!KeBugCheckEx
fffff880`030b3500 fffff800`03038c17 nt!MiBadShareCount+0x4c
fffff880`030b3540 fffff800`030bc057 nt! ?? ::FNODOBFM::`string'+0x2cf6d
fffff880`030b36f0 fffff800`030bda09 nt!MiDeleteVirtualAddresses+0x41f
fffff880`030b38b0 fffff800`033a9f21 nt!MiRemoveMappedView+0xd9
fffff880`030b39d0 fffff800`033aa323 nt!MiUnmapViewOfSection+0x1b1
fffff880`030b3a90 fffff800`03089693 nt!NtUnmapViewOfSection+0x5f
fffff880`030b3ae0 00000000`76febfda nt!KiSystemServiceCopyEnd+0x13
00000000`0a8df5d8 00000000`00000000 0x76febfda

 

Memory problems are typically caused by failing memory modules, so I tested the memory with Memtest86+.

Only had time to run it briefly, but it completed one pass without errors.

However, the next day the computer crashed again with another BSOD…

 

I found this in Event Viewer:

Log Name:      System
Source:        Microsoft-Windows-WER-SystemErrorReporting
Event ID:      1001
Task Category: None
Level:         Error
Keywords:      Classic
Description:
The computer has rebooted from a bugcheck.  The bugcheck was: 0x00000050 (0xfffff8a0384b1280, 0x0000000000000000, 0xfffff800031fe133, 0x0000000000000000). A dump was saved in: C:\Windows\MEMORY.DMP.

 

Examined the new memory dump with:

!analyze -v

Part of the result:

PAGE_FAULT_IN_NONPAGED_AREA (50)
Invalid system memory was referenced.  This cannot be protected by try-except,
it must be protected by a Probe.  Typically the address is just plain bad or it
is pointing at freed memory.
Arguments:
Arg1: fffff8a0384b1280, memory referenced.
Arg2: 0000000000000000, value 0 = read operation, 1 = write operation.
Arg3: fffff800031fe133, If non-zero, the instruction address which referenced the bad memory
address.
Arg4: 0000000000000000, (reserved)

 

Examined the call stack with:

kp

Result:

Child-SP          RetAddr           Call Site
fffff880`031735f8 fffff800`031442be nt!KeBugCheckEx
fffff880`03173600 fffff800`030c552e nt! ?? ::FNODOBFM::`string'+0x3bc5f
fffff880`03173760 fffff800`031fe133 nt!KiPageFault+0x16e
fffff880`031738f0 fffff800`030af3b1 nt!ExFreePoolWithTag+0x43
fffff880`031739a0 fffff880`018450c6 nt!FsRtlUninitializeBaseMcb+0x41
fffff880`031739d0 fffff800`030d0355 Ntfs!NtfsMcbCleanupLruQueue+0xf6
fffff880`03173b70 fffff800`03362236 nt!ExpWorkerThread+0x111
fffff880`03173c00 fffff800`030b8706 nt!PspSystemThreadStartup+0x5a
fffff880`03173c40 00000000`00000000 nt!KxStartSystemThread+0x16

 

Another BSOD related to memory access strongly indicated problems with the memory modules.

Ran Memtest86+ overnight for 15+ hours.

The next day Memtest86+ had found 160 memory errors…

 

I decided to reseat the memory modules.

Then ran Memtest86+ overnight again for almost 16 hours.

The next day no memory errors were found.

Hoping that the cause and solution for the BSOD crashes have been found. Time will tell.

Wireshark can hang when stopping capture

I have experienced problems with recent Wireshark versions on Windows, including the current version 2.2.3.

 

The way I normally use Wireshark is to capture the traffic of interest, then stop the capture and finally analyze the traffic.

However, Wireshark on Windows can hang when stopping the capture…

 

When it happens, the Wireshark UI becomes non-responsive.

 

The main thread runs at 100% CPU utilization.

 

Process Monitor shows that the Wireshark process reads “End Of File” from the same temporary file over and over.

 

Started debugging the issue by taking multiple memory dumps of the wireshark.exe process using procdump with:

procdump -ma -n 10 -s 1 wireshark.exe

 

Opened the first memory dump in WinDbg (x64).

Checked the call stack for the main thread (0) with:

~0 kp

Result:

# Child-SP          RetAddr           Call Site
00 00000039`a00f9fc8 00007ffc`e5dbc264 ntdll!NtReadFile+0x14
01 00000039`a00f9fd0 00007ffc`d9ca9aa7 KERNELBASE!ReadFile+0x74
02 00000039`a00fa050 00007ffc`d9ca9782 msvcr120!_read_nolock(int fh = 0n8, void * inputbuf = 0x000001a8`3e57bd20, unsigned int cnt = 0x1000)+0x2cf [f:\dd\vctools\crt\crtw32\lowio\read.c @ 256]
03 00000039`a00fa0f0 00007ffc`a14f09b3 msvcr120!_read(int fh = 0n8, void * buf = 0x000001a8`3e57bd20, unsigned int cnt = 0x1000)+0xc6 [f:\dd\vctools\crt\crtw32\lowio\read.c @ 92]
04 00000039`a00fa140 00007ffc`a14efde4 libwiretap!file_tell+0xc93
05 00000039`a00fa170 00007ffc`a14ef98c libwiretap!file_tell+0xc4
06 00000039`a00fa1a0 00007ffc`a1517c8d libwiretap!file_read+0xac
07 00000039`a00fa1e0 00007ffc`a150abb3 libwiretap!wtap_read_bytes_or_eof+0x2d
08 00000039`a00fa210 00007ffc`a150a9a9 libwiretap!wtap_wtap_encap_to_pcap_encap+0xbd3
09 00000039`a00fa290 00007ffc`a1517b77 libwiretap!wtap_wtap_encap_to_pcap_encap+0x9c9
0a 00000039`a00fa310 00007ff7`c7929fd3 libwiretap!wtap_read+0x37
0b 00000039`a00fa350 00007ff7`c7b7cfd7 Wireshark+0x9fd3
0c 00000039`a00fa3a0 00007ff7`c7b963dd Wireshark+0x25cfd7
0d 00000039`a00fa3e0 00007ff7`c798889a Wireshark+0x2763dd
0e 00000039`a00fb470 00007ff7`c79d5144 Wireshark+0x6889a
0f 00000039`a00fb4c0 00000000`5b94f906 Wireshark+0xb5144
10 00000039`a00fb5c0 00000000`5b9c4d66 Qt5Core!QMetaObject::activate+0x5a6
11 00000039`a00fb6d0 00000000`5b95413a Qt5Core!QTimer::timeout+0x16
12 00000039`a00fb700 00000000`5bcf7d12 Qt5Core!QObject::event+0x6a
13 00000039`a00fb8b0 00000000`5bcf6c2f Qt5Widgets!QApplicationPrivate::notify_helper+0x112
14 00000039`a00fb8e0 00000000`5b92f689 Qt5Widgets!QApplication::notify+0x1b3f
15 00000039`a00fc000 00000000`5b977a8c Qt5Core!QCoreApplication::notifyInternal2+0xb9
16 00000039`a00fc080 00000000`5b976a32 Qt5Core!QEventDispatcherWin32Private::sendTimerEvent+0x10c
17 00000039`a00fc0e0 00007ffc`e6851c24 Qt5Core!QEventDispatcherWin32::processEvents+0xd82
18 00000039`a00fc1f0 00007ffc`e685156c user32!UserCallWinProcCheckWow+0x274
19 00000039`a00fc350 00000000`5b9761d9 user32!DispatchMessageWorker+0x1ac
1a 00000039`a00fc3d0 00007ffc`a14029b9 Qt5Core!QEventDispatcherWin32::processEvents+0x529
1b 00000039`a00ff760 00000000`5b92bf91 qwindows!qt_plugin_query_metadata+0x2499
1c 00000039`a00ff790 00000000`5b92e477 Qt5Core!QEventLoop::exec+0x1b1
1d 00000039`a00ff810 00007ff7`c7929ccd Qt5Core!QCoreApplication::exec+0x147
1e 00000039`a00ff880 00007ff7`c7ba2ac5 Wireshark+0x9ccd
1f 00000039`a00ffd50 00007ff7`c7ba22fd Wireshark+0x282ac5
20 00000039`a00ffde0 00007ffc`e87c8364 Wireshark+0x2822fd
21 00000039`a00ffe20 00007ffc`e8f470d1 kernel32!BaseThreadInitThunk+0x14
22 00000039`a00ffe50 00000000`00000000 ntdll!RtlUserThreadStart+0x21

 

Checked the main thread (0) call stack for all the memory dumps by scripting CDB, the console version of WinDbg.

Used this PowerShell script:

$dmpPath = 'C:\Bin\Procdump\Wireshark\'
$dmpFiles = Get-ChildItem -Path $dmpPath -Recurse -Include *.dmp

foreach ($dmpFile in $dmpFiles)
{
    & "C:\Program Files (x86)\Windows Kits\10\Debuggers\x64\cdb.exe" -z $dmpFile -c "~0 kp; Q"
}

 

Noticed that all the memory dumps had the same call stack for thread 0, which further indicated that Wireshark was stuck or running in an endless loop.
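Comparing the stacks by eye works for ten dumps, but the comparison can also be scripted on the saved debugger output. This is a sketch under the assumption that each output contains stack lines in the format shown above; the function names are my own, and only the Call Site column is compared.

```python
import re

# Matches stack lines such as:
# 00 00000039`a00f9fc8 00007ffc`e5dbc264 ntdll!NtReadFile+0x14
FRAME_RE = re.compile(
    r"^[0-9a-f]+\s+[0-9a-f]{8}`[0-9a-f]{8}\s+[0-9a-f]{8}`[0-9a-f]{8}\s+(.+)$"
)


def call_sites(debugger_output):
    """Return the Call Site column of every stack frame in the output."""
    sites = []
    for line in debugger_output.splitlines():
        match = FRAME_RE.match(line.strip())
        if match:
            sites.append(match.group(1))
    return sites


def all_stacks_identical(outputs):
    """True when every debugger output contains the same non-empty stack."""
    stacks = [call_sites(text) for text in outputs]
    if not stacks or not stacks[0]:
        return False
    return all(stack == stacks[0] for stack in stacks)


sample = (
    " # Child-SP          RetAddr           Call Site\n"
    "00 00000039`a00f9fc8 00007ffc`e5dbc264 ntdll!NtReadFile+0x14\n"
    "01 00000039`a00f9fd0 00007ffc`d9ca9aa7 KERNELBASE!ReadFile+0x74\n"
)
print(all_stacks_identical([sample, sample]))
```

Identical stacks across dumps taken one second apart do not prove an endless loop on their own, which is why the next step was to attach a debugger.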

 

Now I wanted to determine whether this was a hang or an endless loop.

Attached to the running, but non-responsive Wireshark process with WinDbg (x64).

Experimented by setting breakpoints from the initial call stack.

 

Set a single breakpoint at:

MSVCR120!_read_nolock(...)+0x2cf

With:

bc *
bp MSVCR120!_read_nolock+0x2cf

Result after continue: Breakpoint was continually hit.

 

This definitely indicated an endless loop.

 

Set a single breakpoint at:

libwiretap!wtap_read_bytes_or_eof+0x2d

With:

bc *
bp libwiretap!wtap_read_bytes_or_eof+0x2d

Result after continue: Breakpoint was continually hit.

 

Set a single breakpoint at:

Qt5Core!QTimer::timeout+0x16

With:

bc *
bp Qt5Core!QTimer::timeout+0x16

Result after continue: Breakpoint was not hit (within 1 minute)

 

Set single breakpoints at the various Wireshark functions.

The breakpoint was not hit until setting a single breakpoint at:

Wireshark+0x9fd3

With:

bc *
bp Wireshark+0x9fd3

Result after continue: The breakpoint was continually hit.

 

This indicated that the endless loop occurred in the Wireshark module.
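The reasoning behind the breakpoint experiments can be written down as a small search: walking outwards from the innermost frame, the loop lives in the last frame whose return address is still hit repeatedly. A sketch using only the observations reported above (the function name is hypothetical):

```python
def find_looping_frame(frames):
    """Find the frame that contains the loop.

    frames is a list of (return_site, hit_repeatedly) pairs ordered from
    the innermost frame outwards. Returns the outermost return site that
    was still hit repeatedly, or None if the first breakpoint was not hit.
    """
    looping = None
    for site, hit in frames:
        if not hit:
            break
        looping = site
    return looping


# Breakpoint results from the experiments, innermost frame first.
observations = [
    ("MSVCR120!_read_nolock+0x2cf", True),
    ("libwiretap!wtap_read_bytes_or_eof+0x2d", True),
    ("Wireshark+0x9fd3", True),
    ("Qt5Core!QTimer::timeout+0x16", False),
]
print(find_looping_frame(observations))
```

A return address that keeps being hit means the callee keeps returning to that spot, so the caller is the one repeating the call.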

 

Had a quick look at the disassembly and tried single stepping with:

p

The code was definitely looping, but the problem was not obvious by looking at the machine code.

 

Went looking for debug symbols and found them here:

https://1.eu.dl.wireshark.org/win64/all-versions/Wireshark-pdb-win64-2.2.3.zip

Downloaded the debug symbols and unpacked them to a temporary folder.

 

Then modified the symbol path in WinDbg with:

.sympath C:\Temp\WSSymbols\;srv*c:\SymbolsCache*https://msdl.microsoft.com/download/symbols

And reloaded the symbols with:

.reload /f
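The symbol path is a semicolon-separated list, where a plain entry is a local folder and an srv* entry names a local cache followed by a symbol server. A rough sketch for splitting such a path into its parts; it only handles the two forms used here and is not a full parser for the syntax.

```python
def parse_symbol_path(sympath):
    """Split a WinDbg symbol path into local folders and symbol servers."""
    entries = []
    for element in sympath.split(";"):
        element = element.strip()
        if not element:
            continue
        parts = element.split("*")
        if parts[0].lower() == "srv" and len(parts) == 3:
            # Form: srv*<local cache folder>*<symbol server URL>
            entries.append(
                {"type": "server", "cache": parts[1], "url": parts[2]}
            )
        else:
            entries.append({"type": "folder", "path": element})
    return entries


path = (
    r"C:\Temp\WSSymbols\;srv*c:\SymbolsCache"
    r"*https://msdl.microsoft.com/download/symbols"
)
for entry in parse_symbol_path(path):
    print(entry)
```

Here the first entry is the folder with the unpacked Wireshark symbols and the second is the Microsoft symbol server with a local cache.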

 

Checked the call stack again with:

~0 kp

Result:

# Child-SP          RetAddr           Call Site
00 00000039`a00f9fc8 00007ffc`e5dbc264 ntdll!NtReadFile+0x14
01 00000039`a00f9fd0 00007ffc`d9ca9aa7 KERNELBASE!ReadFile+0x74
02 00000039`a00fa050 00007ffc`d9ca9782 MSVCR120!_read_nolock(int fh = 0n8, void * inputbuf = 0x000001a8`3e57bd20, unsigned int cnt = 0x1000)+0x2cf [f:\dd\vctools\crt\crtw32\lowio\read.c @ 256]
03 00000039`a00fa0f0 00007ffc`a14f09b3 MSVCR120!_read(int fh = 0n8, void * buf = 0x000001a8`3e57bd20, unsigned int cnt = 0x1000)+0xc6 [f:\dd\vctools\crt\crtw32\lowio\read.c @ 92]
04 00000039`a00fa140 00007ffc`a14efde4 libwiretap!raw_read(struct wtap_reader * state = 0x000001a8`4ad0ebd0, unsigned char * buf = 0x000001a8`3e57bd20 "vided by dumpcap???", unsigned int count = 0x1000, unsigned int * have = 0x000001a8`4ad0ec08)+0x43 [c:\buildbot\wireshark\wireshark-2.2-64\windows-2012r2-x64\build\wiretap\file_wrappers.c @ 133]
05 00000039`a00fa170 00007ffc`a14ef98c libwiretap!fill_out_buffer(struct wtap_reader * state = 0x000001a8`4ad0ebd0)+0x44 [c:\buildbot\wireshark\wireshark-2.2-64\windows-2012r2-x64\build\wiretap\file_wrappers.c @ 704]
06 00000039`a00fa1a0 00007ffc`a1517c8d libwiretap!file_read(void * buf = 0x00000039`a00fa250, unsigned int len = 8, struct wtap_reader * file = 0x000001a8`4ad0ebd0)+0xac [c:\buildbot\wireshark\wireshark-2.2-64\windows-2012r2-x64\build\wiretap\file_wrappers.c @ 1237]
07 00000039`a00fa1e0 00007ffc`a150abb3 libwiretap!wtap_read_bytes_or_eof(struct wtap_reader * fh = 0x000001a8`4ad0ebd0, void * buf = <Value unavailable error>, unsigned int count = 8, int * err = 0x00000039`a00fa3a0, char ** err_info = 0x00000039`a00fa3b0)+0x2d [c:\buildbot\wireshark\wireshark-2.2-64\windows-2012r2-x64\build\wiretap\wtap.c @ 1291]
08 00000039`a00fa210 00007ffc`a150a9a9 libwiretap!pcapng_read_block(struct wtap * wth = 0x000001a8`41f48790, struct wtap_reader * fh = 0x000001a8`4ad0ebd0, struct pcapng_t * pn = 0x000001a8`41f1ca40, struct wtapng_block_s * wblock = 0x00000039`a00fa2c0, int * err = 0x00000039`a00fa3a0, char ** err_info = 0x00000039`a00fa3b0)+0x53 [c:\buildbot\wireshark\wireshark-2.2-64\windows-2012r2-x64\build\wiretap\pcapng.c @ 2303]
09 00000039`a00fa290 00007ffc`a1517b77 libwiretap!pcapng_read(struct wtap * wth = 0x000001a8`41f48790, int * err = 0x00000039`a00fa3a0, char ** err_info = 0x00000039`a00fa3b0, int64 * data_offset = 0x00000039`a00fa3a8)+0x79 [c:\buildbot\wireshark\wireshark-2.2-64\windows-2012r2-x64\build\wiretap\pcapng.c @ 2589]
0a 00000039`a00fa310 00007ff7`c7929fd3 libwiretap!wtap_read(struct wtap * wth = 0x000001a8`41f48790, int * err = 0x00000039`a00fa3a0, char ** err_info = 0x00000039`a00fa3b0, int64 * data_offset = <Value unavailable error>)+0x37 [c:\buildbot\wireshark\wireshark-2.2-64\windows-2012r2-x64\build\wiretap\wtap.c @ 1237]
0b 00000039`a00fa350 00007ff7`c7b7cfd7 Wireshark!capture_info_new_packets(int to_read = 0n14, struct _info_data * cap_info = 0x000001a8`3f1838b8)+0x43 [c:\buildbot\wireshark\wireshark-2.2-64\windows-2012r2-x64\build\capture_info.c @ 211]
0c 00000039`a00fa3a0 00007ff7`c7b963dd Wireshark!capture_input_new_packets(struct _capture_session * cap_session = 0x000001a8`3f183878, int to_read = 0n23)+0xa7 [c:\buildbot\wireshark\wireshark-2.2-64\windows-2012r2-x64\build\ui\capture.c @ 407]
0d 00000039`a00fa3e0 00007ff7`c798889a Wireshark!sync_pipe_input_cb(int source = 0n4, void * user_data = 0x000001a8`3f183878)+0x1bd [c:\buildbot\wireshark\wireshark-2.2-64\windows-2012r2-x64\build\capchild\capture_sync.c @ 1775]
0e 00000039`a00fb470 00007ff7`c79d5144 Wireshark!MainWindow::pipeTimeout(void)+0x8a [c:\buildbot\wireshark\wireshark-2.2-64\windows-2012r2-x64\build\ui\qt\main_window_slots.cpp @ 949]
0f 00000039`a00fb4c0 00000000`5b94f906 Wireshark!MainWindow::qt_static_metacall(class QObject * _o = 0x000001a8`3f1836e0, QMetaObject::Call _c = <Value unavailable error>, int _id = <Value unavailable error>, void ** _a = 0x00000039`a00fb5f8)+0x4c4 [c:\buildbot\wireshark\wireshark-2.2-64\windows-2012r2-x64\build\cmbuild\ui\qt\moc_main_window.cpp @ 1436]
10 00000039`a00fb5c0 00000000`5b9c4d66 Qt5Core!QMetaObject::activate+0x5a6
11 00000039`a00fb6d0 00000000`5b95413a Qt5Core!QTimer::timeout+0x16
12 00000039`a00fb700 00000000`5bcf7d12 Qt5Core!QObject::event+0x6a
13 00000039`a00fb8b0 00000000`5bcf6c2f Qt5Widgets!QApplicationPrivate::notify_helper+0x112
14 00000039`a00fb8e0 00000000`5b92f689 Qt5Widgets!QApplication::notify+0x1b3f
15 00000039`a00fc000 00000000`5b977a8c Qt5Core!QCoreApplication::notifyInternal2+0xb9
16 00000039`a00fc080 00000000`5b976a32 Qt5Core!QEventDispatcherWin32Private::sendTimerEvent+0x10c
17 00000039`a00fc0e0 00007ffc`e6851c24 Qt5Core!QEventDispatcherWin32::processEvents+0xd82
18 00000039`a00fc1f0 00007ffc`e685156c USER32!UserCallWinProcCheckWow+0x274
19 00000039`a00fc350 00000000`5b9761d9 USER32!DispatchMessageWorker+0x1ac
1a 00000039`a00fc3d0 00007ffc`a14029b9 Qt5Core!QEventDispatcherWin32::processEvents+0x529
1b 00000039`a00ff760 00000000`5b92bf91 qwindows!qt_plugin_query_metadata+0x2499
1c 00000039`a00ff790 00000000`5b92e477 Qt5Core!QEventLoop::exec+0x1b1
1d 00000039`a00ff810 00007ff7`c7929ccd Qt5Core!QCoreApplication::exec+0x147
1e 00000039`a00ff880 00007ff7`c7ba2ac5 Wireshark!main(int argc = 0n1, char ** qt_argv = <Value unavailable error>)+0xe3d [c:\buildbot\wireshark\wireshark-2.2-64\windows-2012r2-x64\build\wireshark-qt.cpp @ 853]
1f 00000039`a00ffd50 00007ff7`c7ba22fd Wireshark!WinMain+0x155
20 00000039`a00ffde0 00007ffc`e87c8364 Wireshark!__tmainCRTStartup(void)+0x149 [f:\dd\vctools\crt\crtw32\dllstuff\crtexe.c @ 618]
21 00000039`a00ffe20 00007ffc`e8f470d1 KERNEL32!BaseThreadInitThunk+0x14
22 00000039`a00ffe50 00000000`00000000 ntdll!RtlUserThreadStart+0x21

 

The symbols seemed to be correct and functional.

 

Wanted to verify the location of the endless loop by setting breakpoints again.

 

Set a single breakpoint at:

Wireshark!capture_input_new_packets

With:

bc *
bp Wireshark!capture_input_new_packets

Result after continue: Breakpoint was not hit (within 1 minute)

 

Set a single breakpoint at:

Wireshark!capture_info_new_packets

With:

bc *
bp Wireshark!capture_info_new_packets

Result after continue: Breakpoint was not hit (within 1 minute)

 

Set a single breakpoint at:

libwiretap!wtap_read

With:

bc *
bp libwiretap!wtap_read

Result after continue: The breakpoint was continually hit.

 

This result indicated that the endless loop was in Wireshark!capture_info_new_packets(…) calling libwiretap!wtap_read(…), specifically here:

Wireshark!capture_info_new_packets(int to_read = 0n14, struct _info_data * cap_info = 0x000001a8`3f1838b8)+0x43 [c:\buildbot\wireshark\wireshark-2.2-64\windows-2012r2-x64\build\capture_info.c @ 211]

 

Downloaded the source code for Wireshark from:

https://1.eu.dl.wireshark.org/src/wireshark-2.2.3.tar.bz2

 

Looked for and opened: capture_info.c

Found the function, which has been included here:

/* new packets arrived */
void capture_info_new_packets(int to_read, info_data_t* cap_info)
{
    int err;
    gchar *err_info;
    gint64 data_offset;
    struct wtap_pkthdr *phdr;
    union wtap_pseudo_header *pseudo_header;
    int wtap_linktype;
    const guchar *buf;

    cap_info->ui.new_packets = to_read;

    /*g_warning("new packets: %u", to_read);*/

    while (to_read > 0) {
        wtap_cleareof(cap_info->wtap);
        if (wtap_read(cap_info->wtap, &err, &err_info, &data_offset)) {
            phdr = wtap_phdr(cap_info->wtap);
            pseudo_header = &phdr->pseudo_header;
            wtap_linktype = phdr->pkt_encap;
            buf = wtap_buf_ptr(cap_info->wtap);

            capture_info_packet(cap_info, wtap_linktype, buf, phdr->caplen, pseudo_header);

            /*g_warning("new packet");*/
            to_read--;
        }
    }

    capture_info_ui_update(&cap_info->ui);
}

 

The points of interest are the while condition, the call to wtap_read and the decrement of to_read.

 

The while loop continues as long as to_read is greater than 0.

Noticed that the to_read variable continued to have the value 14.

to_read is only decremented in case wtap_read returns a non-zero value.

It seems that this never happens when the endless loop occurs in Wireshark.
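The behaviour can be reproduced with a tiny simulation of the loop shape. This is not Wireshark code: the class is a stand-in for the capture file, and the iteration cap exists only so that the simulation terminates.

```python
class FakeCaptureFile:
    """Stand-in for the capture file: yields a fixed number of packets."""

    def __init__(self, packets_available):
        self.packets_available = packets_available

    def read(self):
        """Mimic wtap_read: True when a packet was read, False otherwise."""
        if self.packets_available > 0:
            self.packets_available -= 1
            return True
        return False


def count_iterations(to_read, capture, max_iterations=1000):
    """Replay the loop shape of capture_info_new_packets with a safety cap.

    Returns (iterations, remaining_to_read).
    """
    iterations = 0
    while to_read > 0 and iterations < max_iterations:
        iterations += 1
        if capture.read():
            to_read -= 1  # only decremented after a successful read
    return iterations, to_read


print(count_iterations(14, FakeCaptureFile(14)))  # all packets available
print(count_iterations(14, FakeCaptureFile(0)))   # reads keep failing
```

When every read succeeds, the loop ends after 14 iterations. When the reads keep failing, to_read stays at 14 and only the artificial cap stops the loop, which mirrors what the debugger showed.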

 

I have noticed that the problem apparently doesn’t occur with Wireshark Legacy based on GTK (at least with default settings).

I wanted to determine the cause for the difference.

 

Examined symbols for Wireshark (QT) with:

x /D /f Wireshark!capture_info_new_packets

Result:

00007ff7`c7929f90 Wireshark!capture_info_new_packets (int, struct _info_data *)

 

Examined symbols for Wireshark Legacy (GTK) with:

x /D /f Wireshark_gtk!capture_info_new_packets

Result:

00007ff7`c03310f0 Wireshark_gtk!capture_info_new_packets (int, struct _info_data *)

 

Both versions include the function capture_info_new_packets (…)

 

Checked where the function was mentioned in the source code with grep (from Cygwin):

grep "capture_info_new_packets" -r

Result:

capture_info.c:void capture_info_new_packets(int to_read, info_data_t* cap_info)
capture_info.h:extern void capture_info_new_packets(int to_read, info_data_t* cap_info);
ui/capture.c:    capture_info_new_packets(to_read, cap_session->cap_data_info);

 

Checked the content of ui/capture.c and found:

if(capture_opts->show_info)
    capture_info_new_packets(to_read, cap_session->cap_data_info);

 

So the capture_info_new_packets(…) function will only be called if the show_info option is true…

 

Looked for places where show_info was used with:

grep "show_info" -r

Part of the result was:

ui/gtk/capture_dlg.c:  gtk_toggle_button_set_active(GTK_TOGGLE_BUTTON(hide_info_cb), !global_capture_opts.show_info);
ui/gtk/capture_dlg.c:  global_capture_opts.show_info =
ui/qt/capture_interfaces_dialog.cpp:    global_capture_opts.show_info = checked;
ui/qt/capture_interfaces_dialog.cpp:    ui->cbExtraCaptureInfo->setChecked(global_capture_opts.show_info);

 

For Wireshark Legacy (GTK) I checked the content of ui/gtk/capture_dlg.c and found:

global_capture_opts.show_info =
    !gtk_toggle_button_get_active(GTK_TOGGLE_BUTTON(hide_info_cb));

 

For Wireshark Legacy (GTK) the endless loop problem can be avoided by enabling this option in the Capture Options dialog:

Hide capture info dialog   (default setting)

 

For Wireshark (QT) I checked the content of ui/qt/capture_interfaces_dialog.cpp and found:

void CaptureInterfacesDialog::on_cbExtraCaptureInfo_toggled(bool checked)
{
    global_capture_opts.show_info = checked;
}

 

For Wireshark (QT), the endless loop problem can be avoided by disabling this option in the Capture Interfaces dialog:

Show extra capture information dialog   (not default setting)
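
The two user interfaces connect their checkbox to show_info with opposite polarity, which is why the workaround is "enable" for GTK but "disable" for QT. A minimal Python sketch of that logic, based only on the source snippets quoted above. The function names are hypothetical and this is not Wireshark code:

```python
def gtk_show_info(hide_info_checked: bool) -> bool:
    """GTK: show_info is the inverse of the 'Hide capture info dialog' checkbox."""
    return not hide_info_checked


def qt_show_info(extra_info_checked: bool) -> bool:
    """QT: show_info follows the 'Show extra capture information dialog' checkbox."""
    return extra_info_checked


def calls_capture_info_new_packets(show_info: bool) -> bool:
    """Mirrors the guard in ui/capture.c: the call only happens if show_info is set."""
    return show_info


# Settings that keep capture_info_new_packets() from being called:
print(calls_capture_info_new_packets(gtk_show_info(hide_info_checked=True)))
print(calls_capture_info_new_packets(qt_show_info(extra_info_checked=False)))
```

Both lines print False, matching the workaround settings for each version.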

 

Conclusion

The endless loop problem occurs consistently when the capture info dialog is shown and never when it is hidden, so the cause has likely been found along with a workaround.

Until the endless loop bug has been fixed in Wireshark, it is recommended to set the Wireshark options as described above.

Examining DRIVER_CORRUPTED_EXPOOL (c5) BSOD

My work computer recently crashed with a BSOD, when disconnecting or reconnecting Cisco AnyConnect Secure Mobility Client.

Considering the circumstances, I suspected that Cisco AnyConnect was the culprit, but I wanted to confirm this.

 

I started looking for information in Event Viewer and found:

Log Name:      System
Source:        Microsoft-Windows-WER-SystemErrorReporting
Event ID:      1001
Task Category: None
Level:         Error
Keywords:      Classic
Description:
The computer has rebooted from a bugcheck.  The bugcheck was: 0x000000c5 (0x00000000760e0002, 0x0000000000000002, 0x0000000000000000, 0xfffff802fba61850). A dump was saved in: C:\WINDOWS\MEMORY.DMP.
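
The bugcheck code and its four parameters can be pulled out of the event description with a small script. This is only a sketch that assumes the description has the exact wording shown above. The function name parse_bugcheck is my own:

```python
import re


def parse_bugcheck(description: str):
    """Return (bugcheck code, [four parameters]) from an Event ID 1001 description."""
    match = re.search(
        r"The bugcheck was: (0x[0-9a-fA-F]+) \(([^)]*)\)", description
    )
    if match is None:
        raise ValueError("no bugcheck found in description")
    code = int(match.group(1), 16)
    # Parameters are comma-separated hex values inside the parentheses.
    args = [int(part.strip(), 16) for part in match.group(2).split(",")]
    return code, args


text = (
    "The computer has rebooted from a bugcheck.  The bugcheck was: 0x000000c5 "
    "(0x00000000760e0002, 0x0000000000000002, 0x0000000000000000, "
    "0xfffff802fba61850). A dump was saved in: C:\\WINDOWS\\MEMORY.DMP."
)
code, args = parse_bugcheck(text)
print(hex(code), [hex(a) for a in args])
```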

 

Needed to examine the memory dump, so started WinDbg (x64) and opened:

C:\Windows\Memory.dmp

This message was displayed:

BugCheck C5, {760e0002, 2, 0, fffff802fba61850}

*** ERROR: Module load completed but symbols could not be loaded for acsock64.sys
Probably caused by : memory_corruption

 

Checked for more details with:

!analyze -v

Part of the result:

DRIVER_CORRUPTED_EXPOOL (c5)
An attempt was made to access a pageable (or completely invalid) address at an
interrupt request level (IRQL) that is too high.  This is
caused by drivers that have corrupted the system pool.  Run the driver
verifier against any new (or suspect) drivers, and if that doesn't turn up
the culprit, then use gflags to enable special pool.
Arguments:
Arg1: 00000000760e0002, memory referenced
Arg2: 0000000000000002, IRQL
Arg3: 0000000000000000, value 0 = read operation, 1 = write operation
Arg4: fffff802fba61850, address which referenced memory

 

The crash definitely seemed to be caused by a driver bug.

Now I wanted to identify which driver caused the problem.

 

Checked the call stack with:

kc

Result:

# Call Site
00 nt!KeBugCheckEx
01 nt!KiBugCheckDispatch
02 nt!KiPageFault
03 nt!ExDeferredFreePool
04 nt!ExFreePoolWithTag
05 acsock64
06 acsock64
07 acsock64
08 NETIO!ProcessCallout
09 NETIO!ProcessFastCalloutClassify
0a NETIO!KfdClassify
0b tcpip!AleNotifyEndpointTeardown
0c tcpip!UdpCleanupEndpointWorkQueueRoutine
0d tcpip!UdpCloseEndpoint
0e afd!AfdTLCloseEndpoint
0f afd!AfdCloseTransportEndpoint
10 afd!AfdCleanupCore
11 afd!AfdDispatch
12 nt!IopCloseFile
13 nt!ObCloseHandleTableEntry
14 nt!NtClose
15 nt!KiSystemServiceCopyEnd
16 0x0
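
Reading the stack from the top, the first module that is not a Windows component is the main suspect. A rough Python sketch of that check follows. The set of "known Windows modules" is my own assumption, limited to the modules seen in this particular stack, and the function name is hypothetical:

```python
# First frames of the call stack above, top (most recent) first.
FRAMES = [
    "nt!KeBugCheckEx",
    "nt!KiBugCheckDispatch",
    "nt!KiPageFault",
    "nt!ExDeferredFreePool",
    "nt!ExFreePoolWithTag",
    "acsock64",
    "acsock64",
    "acsock64",
    "NETIO!ProcessCallout",
    "tcpip!AleNotifyEndpointTeardown",
    "afd!AfdDispatch",
]

# Assumption: these are treated as Windows components for this stack only.
KNOWN_WINDOWS_MODULES = {"nt", "NETIO", "tcpip", "afd"}


def first_third_party_module(frames, known_modules):
    """Return the first module name in the stack that is not in known_modules."""
    for frame in frames:
        if frame.startswith("0x"):
            continue  # raw address without a module name
        module = frame.split("!", 1)[0]
        if module not in known_modules:
            return module
    return None


print(first_third_party_module(FRAMES, KNOWN_WINDOWS_MODULES))
```

For this stack the result is acsock64, the driver that called nt!ExFreePoolWithTag right before the page fault.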

 

Noticed that the acsock64 calls occurred just before the crash.

 

Examined available information about acsock64 with:

lmv m acsock64

Part of the result:

start             end                 module name
fffff807`453d0000 fffff807`45406000   acsock64   (no symbols)
Loaded symbol image file: acsock64.sys
Image path: \SystemRoot\system32\DRIVERS\acsock64.sys
Image name: acsock64.sys
Timestamp:        Thu Oct 08 17:12:56 2015 (561687F8)
CheckSum:         0003DD8B
ImageSize:        00036000
Translations:     0000.04b0 0000.04e4 0409.04b0 0409.04e4

 

Noticed that the driver was around 14 months old at the time of writing.
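
The hexadecimal value in parentheses after the timestamp is the build time in seconds since 1970-01-01 UTC, so the driver age can also be calculated. A small Python sketch follows. The reference date used here is only a hypothetical example of a "time of writing", and both function names are my own:

```python
from datetime import datetime, timezone


def timestamp_to_utc(hex_stamp: str) -> datetime:
    """Convert the hex timestamp shown by lmv to a UTC datetime."""
    return datetime.fromtimestamp(int(hex_stamp, 16), tz=timezone.utc)


def whole_months_between(start: datetime, end: datetime) -> int:
    """Number of whole calendar months from start to end."""
    months = (end.year - start.year) * 12 + (end.month - start.month)
    if end.day < start.day:
        months -= 1
    return months


built = timestamp_to_utc("561687F8")
# Hypothetical reference date, chosen only for illustration.
reference = datetime(2016, 12, 15, tzinfo=timezone.utc)
print(built.isoformat(), whole_months_between(built, reference))
```

The decoded time is in UTC, so it can differ by a few hours from the local time that WinDbg displays.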

 

Found the driver file in Windows Explorer under:

C:\Windows\system32\DRIVERS\acsock64.sys

Then checked Properties. The details view revealed that the driver was:

Cisco AnyConnect Kernel Driver Framework Socket Layer Interceptor

Version: 4.2.1009.0

 

This confirmed my suspicion: Cisco AnyConnect most likely caused the BSOD.

 

At this point I would have updated to the latest version, if updates to Cisco AnyConnect were freely available.

Instead I decided to install and use the Windows port of OpenConnect as an alternative to Cisco AnyConnect.

Examining MEMORY_MANAGEMENT (1a) BSOD

A Lenovo ThinkPad T440p computer recently crashed with a BSOD.

I started looking for clues in Event Viewer and found:

Log Name:      System
Source:        Microsoft-Windows-WER-SystemErrorReporting
Event ID:      1001
Task Category: None
Level:         Error
Keywords:      Classic
Description:
The computer has rebooted from a bugcheck.  The bugcheck was: 0x0000001a (0x0000000000041792, 0xffff808000082f70, 0x0004000000000000, 0x0000000000000000). A dump was saved in: C:\WINDOWS\MEMORY.DMP.

 

Decided to examine the memory dump, so started WinDbg (x64) and opened:

C:\Windows\Memory.dmp

This message was displayed:

BugCheck 1A, {41792, ffff808000082f70, 4000000000000, 0}

Probably caused by : memory_corruption

 

Checked for more details with:

!analyze -v

Part of the result:

*************************************************************
*                                                           *
*                    Bugcheck Analysis                      *
*                                                           *
*************************************************************

MEMORY_MANAGEMENT (1a)
# Any other values for parameter 1 must be individually examined.
Arguments:
Arg1: 0000000000041792, A corrupt PTE has been detected. Parameter 2 contains the address of
the PTE. Parameters 3/4 contain the low/high parts of the PTE.
Arg2: ffff808000082f70
Arg3: 0004000000000000
Arg4: 0000000000000000

 

This issue indicated hardware failure, most likely defective memory.

So I booted Memtest86+ from a USB drive.

Within a few minutes it found multiple errors.

 

Tried cleaning the contacts on the memory modules, but it had no effect.

 

Then I tested each memory module separately in both sockets.

In every case the memory test found errors.

 

Decided to test another 8 GB memory module.

Ran Memtest86+ all night and it found no errors on the replacement memory module.

Conclusion

A computer that crashes with a MEMORY_MANAGEMENT (1a) BSOD likely has defective memory.

Test the memory with Memtest86+ or another test program.

Then replace any identified defective memory modules.

Cleaning computer cooling system to improve performance

While using a laptop computer I noticed high noise levels, caused by the cooling fan.

I decided to check the CPU temperatures using HWMonitor.

The idle temperatures were around 70° C.

And load temperatures were around 85°-94° C.

 

These temperatures could be high enough to cause thermal throttling, which would affect system performance.

Therefore I measured performance with 7-Zip and CPU-Z benchmarks.

 

I decided to disassemble the laptop computer and clean the cooling system, using compressed air.

(Be aware that most compressed air cans contain harmful chemicals, so use them outside or in a well-ventilated area)

 

Cleaning the cooling system had a significant effect.

Now idle temperatures were much lower, around 55° C.

And load temperatures were also lower, around 74°-85° C.

Also noticed much lower fan speeds, so the computer wasn’t as noisy as before.

 

Table of thermal readings:

                   Idle     Load
Before cleaning    70° C    85°-94° C
After cleaning     55° C    74°-85° C
Improvement        15° C    9°-11° C
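
The improvement row is the difference between the readings before and after cleaning. For the load range, the low ends differ by 11° C and the high ends by 9° C, which gives the 9°-11° C range. A minimal Python sketch of that arithmetic, where the function name range_drop is my own:

```python
def range_drop(before, after):
    """Return (smallest, largest) drop between two (low, high) temperature ranges."""
    drops = (before[0] - after[0], before[1] - after[1])
    return min(drops), max(drops)


idle_drop = 70 - 55
load_drop = range_drop(before=(85, 94), after=(74, 85))
print(idle_drop, load_drop)
```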

 

The 7-Zip and CPU-Z benchmark results improved after cleaning, which indicates that the system had been affected by thermal throttling.

 

Table of benchmark results:

                   7-Zip score    CPU-Z single    CPU-Z multi
Before cleaning    12491          1321            4572
After cleaning     16572          1510            5494
Improvement        32.7%          14.3%           20.2%
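
The improvement percentages are calculated relative to the score before cleaning. A short Python sketch of the calculation, using the scores from the table. The function name improvement_percent is my own:

```python
def improvement_percent(before: float, after: float) -> float:
    """Relative improvement of 'after' over 'before', in percent."""
    return (after - before) / before * 100


scores = {
    "7-Zip score": (12491, 16572),
    "CPU-Z single": (1321, 1510),
    "CPU-Z multi": (4572, 5494),
}

for name, (before, after) in scores.items():
    print(f"{name}: {improvement_percent(before, after):.1f}%")
```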

 

Conclusion

It can be worthwhile to clean a computer's cooling system.

It may improve noise levels, temperatures and performance.