Android emulator crashing due to access violation

I recently experienced a situation where Android Studio’s emulator crashed when starting an app that performed basic graphics operations.

Crash reason: EXCEPTION_ACCESS_VIOLATION_WRITE
Crash address: 0x66720000
Assertion: Unknown assertion type 0x00000000
Process uptime: 13 seconds

 

I started up Process Monitor to investigate the issue.

Tried recreating the situation, then stopped capturing events with: File -> Capture Events

Wanted to get an overview of the processes involved with: Tools -> Process Tree…

Right-clicked studio64.exe and selected: Add process and children to Include filter

 

Noticed that emulator64-crash-service.exe left a memory dump under:

C:\Users\{user}\.android\breakpad\d16b5760-c4ac-4f06-a3ec-73aa74a70087.dmp

 

Examined the memory dump with WinDbg (x64).
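(The dump can also be opened directly from a command prompt, for example:

windbg.exe -z C:\Users\{user}\.android\breakpad\d16b5760-c4ac-4f06-a3ec-73aa74a70087.dmp)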

Checked for details about the crash with:

!analyze -v

Part of the result:

FAULTING_IP:
ig9icd64!RegisterProcTableCallback+196cfb
000007fe`c5d9f71b f3410f7f48c0 movdqu xmmword ptr [r8-40h],xmm1

EXCEPTION_RECORD: ffffffffffffffff -- (.exr 0xffffffffffffffff)
ExceptionAddress: 000007fec5d9f71b (ig9icd64!RegisterProcTableCallback+0x0000000000196cfb)
ExceptionCode: c0000005 (Access violation)
ExceptionFlags: 00000000
NumberParameters: 2
Parameter[0]: 0000000000000001
Parameter[1]: 000000000ba50000
Attempt to write to address 000000000ba50000

STACK_TEXT:
... : ig9icd64!RegisterProcTableCallback+0x196cfb
... : ig9icd64!RegisterProcTableCallback+0x4a228e
... : ig9icd64!RegisterProcTableCallback+0x4a36b4
... : ig9icd64!RegisterProcTableCallback+0x4a828d
... : ig9icd64!RegisterProcTableCallback+0x4c52c7
... : ig9icd64!RegisterProcTableCallback+0x4c500f
... : ig9icd64!RegisterProcTableCallback+0x2ee1a2
... : lib64GLES_V2_translator!glReadPixels+0x297
... : lib64OpenglRender+0x364d
... : lib64OpenglRender+0x773b
... : lib64OpenglRender+0x11009
... : lib64OpenglRender!initLibrary+0x52e3
... : lib64OpenglRender!initLibrary+0xc4e2c
... : lib64OpenglRender!initLibrary+0x4c7a4
... : kernel32!BaseThreadInitThunk+0xd
... : ntdll!RtlUserThreadStart+0x1d

 

The faulting module ig9icd64 is part of the Intel graphics driver, so this indicated that the problem was related to the graphics driver and/or its settings.
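The driver module details (path and file version) can be double-checked in the same WinDbg session with:

lmvm ig9icd64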

I did not have important data on the Android emulator, so I got it booting again by removing all data (including the auto-starting app that crashed the emulator).

Did this in the Android Virtual Device Manager with the Wipe Data option.
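(For reference, the same wipe can also be done from the command line with the emulator's -wipe-data option, e.g.: emulator -avd <avd name> -wipe-data)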

 

When the emulator could start again I started looking for additional graphics settings.

Found some under Extended controls -> Settings section -> Advanced tab.

The default setting for OpenGL ES renderer (Autodetect based on host) seemed problematic on this system.

I experimented with the settings and discovered that SwiftShader worked well. Running graphics rendering apps with this setting no longer crashed the emulator.
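The renderer can also be forced when launching the emulator from the command line; a sketch (option values vary between emulator versions):

emulator -avd <avd name> -gpu swiftshader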

The problem occurred on a desktop system with an Intel Skylake CPU (Core i7-6700) running Windows 7 using Intel HD Graphics 530 with driver version 21.20.16.4860.

Conclusion

If Android Studio’s emulator crashes when running graphics rendering apps, it may help to change the OpenGL ES renderer setting.

Wireshark can hang when stopping capture

I have experienced problems with recent Wireshark versions on Windows, including version 2.2.3, the latest at the time of writing.

 

The way I normally use Wireshark is to capture the traffic of interest, then stop the capture and finally analyze the traffic.

However, Wireshark on Windows can hang when stopping the capture…

 

When it happens, the Wireshark UI becomes non-responsive.

 

The main thread runs at 100% CPU utilization.

 

And Process Monitor shows the Wireshark process reading from the same temporary file over and over, with each read returning “END OF FILE”.

 

Started debugging the issue by taking multiple memory dumps of the wireshark.exe process using procdump with:

procdump -ma -n 10 -s 1 wireshark.exe

 

Opened the first memory dump in WinDbg (x64).

Checked the call stack for the main thread (0) with:

~0 kp

Result:

# Child-SP          RetAddr           Call Site
00 00000039`a00f9fc8 00007ffc`e5dbc264 ntdll!NtReadFile+0x14
01 00000039`a00f9fd0 00007ffc`d9ca9aa7 KERNELBASE!ReadFile+0x74
02 00000039`a00fa050 00007ffc`d9ca9782 msvcr120!_read_nolock(int fh = 0n8, void * inputbuf = 0x000001a8`3e57bd20, unsigned int cnt = 0x1000)+0x2cf [f:\dd\vctools\crt\crtw32\lowio\read.c @ 256]
03 00000039`a00fa0f0 00007ffc`a14f09b3 msvcr120!_read(int fh = 0n8, void * buf = 0x000001a8`3e57bd20, unsigned int cnt = 0x1000)+0xc6 [f:\dd\vctools\crt\crtw32\lowio\read.c @ 92]
04 00000039`a00fa140 00007ffc`a14efde4 libwiretap!file_tell+0xc93
05 00000039`a00fa170 00007ffc`a14ef98c libwiretap!file_tell+0xc4
06 00000039`a00fa1a0 00007ffc`a1517c8d libwiretap!file_read+0xac
07 00000039`a00fa1e0 00007ffc`a150abb3 libwiretap!wtap_read_bytes_or_eof+0x2d
08 00000039`a00fa210 00007ffc`a150a9a9 libwiretap!wtap_wtap_encap_to_pcap_encap+0xbd3
09 00000039`a00fa290 00007ffc`a1517b77 libwiretap!wtap_wtap_encap_to_pcap_encap+0x9c9
0a 00000039`a00fa310 00007ff7`c7929fd3 libwiretap!wtap_read+0x37
0b 00000039`a00fa350 00007ff7`c7b7cfd7 Wireshark+0x9fd3
0c 00000039`a00fa3a0 00007ff7`c7b963dd Wireshark+0x25cfd7
0d 00000039`a00fa3e0 00007ff7`c798889a Wireshark+0x2763dd
0e 00000039`a00fb470 00007ff7`c79d5144 Wireshark+0x6889a
0f 00000039`a00fb4c0 00000000`5b94f906 Wireshark+0xb5144
10 00000039`a00fb5c0 00000000`5b9c4d66 Qt5Core!QMetaObject::activate+0x5a6
11 00000039`a00fb6d0 00000000`5b95413a Qt5Core!QTimer::timeout+0x16
12 00000039`a00fb700 00000000`5bcf7d12 Qt5Core!QObject::event+0x6a
13 00000039`a00fb8b0 00000000`5bcf6c2f Qt5Widgets!QApplicationPrivate::notify_helper+0x112
14 00000039`a00fb8e0 00000000`5b92f689 Qt5Widgets!QApplication::notify+0x1b3f
15 00000039`a00fc000 00000000`5b977a8c Qt5Core!QCoreApplication::notifyInternal2+0xb9
16 00000039`a00fc080 00000000`5b976a32 Qt5Core!QEventDispatcherWin32Private::sendTimerEvent+0x10c
17 00000039`a00fc0e0 00007ffc`e6851c24 Qt5Core!QEventDispatcherWin32::processEvents+0xd82
18 00000039`a00fc1f0 00007ffc`e685156c user32!UserCallWinProcCheckWow+0x274
19 00000039`a00fc350 00000000`5b9761d9 user32!DispatchMessageWorker+0x1ac
1a 00000039`a00fc3d0 00007ffc`a14029b9 Qt5Core!QEventDispatcherWin32::processEvents+0x529
1b 00000039`a00ff760 00000000`5b92bf91 qwindows!qt_plugin_query_metadata+0x2499
1c 00000039`a00ff790 00000000`5b92e477 Qt5Core!QEventLoop::exec+0x1b1
1d 00000039`a00ff810 00007ff7`c7929ccd Qt5Core!QCoreApplication::exec+0x147
1e 00000039`a00ff880 00007ff7`c7ba2ac5 Wireshark+0x9ccd
1f 00000039`a00ffd50 00007ff7`c7ba22fd Wireshark+0x282ac5
20 00000039`a00ffde0 00007ffc`e87c8364 Wireshark+0x2822fd
21 00000039`a00ffe20 00007ffc`e8f470d1 kernel32!BaseThreadInitThunk+0x14
22 00000039`a00ffe50 00000000`00000000 ntdll!RtlUserThreadStart+0x21

 

Checked the main thread (0) call stack for all the memory dumps by scripting CDB, the console version of WinDbg.

Used this PowerShell script:

$dmpPath = 'C:\Bin\Procdump\Wireshark\'

# Collect all memory dumps written by procdump
$dmpFiles = Get-ChildItem -Path $dmpPath -Recurse -Include *.dmp

foreach ($dmpFile in $dmpFiles)
{
    # Print the main thread (0) call stack for each dump, then quit cdb
    & "C:\Program Files (x86)\Windows Kits\10\Debuggers\x64\cdb.exe" -z $dmpFile -c "~0 kp; Q"
}
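To make the stacks easier to compare afterwards, the output of each run could also be captured to a text file next to the dump, for example (the file naming is just a suggestion):

foreach ($dmpFile in $dmpFiles)
{
    # Save the thread 0 call stack of each dump to <dump file>.stack.txt
    & "C:\Program Files (x86)\Windows Kits\10\Debuggers\x64\cdb.exe" -z $dmpFile.FullName -c "~0 kp; Q" |
        Out-File -FilePath ($dmpFile.FullName + '.stack.txt')
}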

 

Noticed that all the memory dumps had the same call stack for thread 0, which further indicated that Wireshark was stuck or running in an endless loop.

 

Now I wanted to determine whether this was a hang or an endless loop.

Attached to the running, but non-responsive Wireshark process with WinDbg (x64).

Experimented by setting breakpoints from the initial call stack.

 

Set a single breakpoint at:

MSVCR120!_read_nolock(...)+0x2cf

With:

bc *
bp MSVCR120!_read_nolock+0x2cf

Result after continue: Breakpoint was continually hit.

 

This definitely indicated an endless loop.

 

Set a single breakpoint at:

libwiretap!wtap_read_bytes_or_eof+0x2d

With:

bc *
bp libwiretap!wtap_read_bytes_or_eof+0x2d

Result after continue: Breakpoint was continually hit.

 

Set a single breakpoint at:

Qt5Core!QTimer::timeout+0x16

With:

bc *
bp Qt5Core!QTimer::timeout+0x16

Result after continue: Breakpoint was not hit (within 1 minute)

 

Set single breakpoints at the various Wireshark functions.

No breakpoint was hit until setting a single breakpoint at:

Wireshark+0x9fd3

With:

bc *
bp Wireshark+0x9fd3

Result after continue: The breakpoint was continually hit.

 

This indicated that the endless loop occurred in the Wireshark module.

 

Had a quick look at the disassembly and tried single stepping with:

p

The code was definitely looping, but the problem was not obvious by looking at the machine code.
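Two WinDbg commands that may help when reading unfamiliar machine code like this:

uf @rip    (disassemble the function containing the current instruction)
pc         (resume and break at the next call instruction)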

 

Went looking for debug symbols and found them here:

https://1.eu.dl.wireshark.org/win64/all-versions/Wireshark-pdb-win64-2.2.3.zip

Downloaded the debug symbols and unpacked them to a temporary folder.

 

Then modified the symbol path in WinDbg with:

.sympath C:\Temp\WSSymbols\;srv*c:\SymbolsCache*https://msdl.microsoft.com/download/symbols

And reloaded the symbols with:

.reload /f

 

Checked the call stack again with:

~0 kp

Result:

# Child-SP          RetAddr           Call Site
00 00000039`a00f9fc8 00007ffc`e5dbc264 ntdll!NtReadFile+0x14
01 00000039`a00f9fd0 00007ffc`d9ca9aa7 KERNELBASE!ReadFile+0x74
02 00000039`a00fa050 00007ffc`d9ca9782 MSVCR120!_read_nolock(int fh = 0n8, void * inputbuf = 0x000001a8`3e57bd20, unsigned int cnt = 0x1000)+0x2cf [f:\dd\vctools\crt\crtw32\lowio\read.c @ 256]
03 00000039`a00fa0f0 00007ffc`a14f09b3 MSVCR120!_read(int fh = 0n8, void * buf = 0x000001a8`3e57bd20, unsigned int cnt = 0x1000)+0xc6 [f:\dd\vctools\crt\crtw32\lowio\read.c @ 92]
04 00000039`a00fa140 00007ffc`a14efde4 libwiretap!raw_read(struct wtap_reader * state = 0x000001a8`4ad0ebd0, unsigned char * buf = 0x000001a8`3e57bd20 "vided by dumpcap???", unsigned int count = 0x1000, unsigned int * have = 0x000001a8`4ad0ec08)+0x43 [c:\buildbot\wireshark\wireshark-2.2-64\windows-2012r2-x64\build\wiretap\file_wrappers.c @ 133]
05 00000039`a00fa170 00007ffc`a14ef98c libwiretap!fill_out_buffer(struct wtap_reader * state = 0x000001a8`4ad0ebd0)+0x44 [c:\buildbot\wireshark\wireshark-2.2-64\windows-2012r2-x64\build\wiretap\file_wrappers.c @ 704]
06 00000039`a00fa1a0 00007ffc`a1517c8d libwiretap!file_read(void * buf = 0x00000039`a00fa250, unsigned int len = 8, struct wtap_reader * file = 0x000001a8`4ad0ebd0)+0xac [c:\buildbot\wireshark\wireshark-2.2-64\windows-2012r2-x64\build\wiretap\file_wrappers.c @ 1237]
07 00000039`a00fa1e0 00007ffc`a150abb3 libwiretap!wtap_read_bytes_or_eof(struct wtap_reader * fh = 0x000001a8`4ad0ebd0, void * buf = <Value unavailable error>, unsigned int count = 8, int * err = 0x00000039`a00fa3a0, char ** err_info = 0x00000039`a00fa3b0)+0x2d [c:\buildbot\wireshark\wireshark-2.2-64\windows-2012r2-x64\build\wiretap\wtap.c @ 1291]
08 00000039`a00fa210 00007ffc`a150a9a9 libwiretap!pcapng_read_block(struct wtap * wth = 0x000001a8`41f48790, struct wtap_reader * fh = 0x000001a8`4ad0ebd0, struct pcapng_t * pn = 0x000001a8`41f1ca40, struct wtapng_block_s * wblock = 0x00000039`a00fa2c0, int * err = 0x00000039`a00fa3a0, char ** err_info = 0x00000039`a00fa3b0)+0x53 [c:\buildbot\wireshark\wireshark-2.2-64\windows-2012r2-x64\build\wiretap\pcapng.c @ 2303]
09 00000039`a00fa290 00007ffc`a1517b77 libwiretap!pcapng_read(struct wtap * wth = 0x000001a8`41f48790, int * err = 0x00000039`a00fa3a0, char ** err_info = 0x00000039`a00fa3b0, int64 * data_offset = 0x00000039`a00fa3a8)+0x79 [c:\buildbot\wireshark\wireshark-2.2-64\windows-2012r2-x64\build\wiretap\pcapng.c @ 2589]
0a 00000039`a00fa310 00007ff7`c7929fd3 libwiretap!wtap_read(struct wtap * wth = 0x000001a8`41f48790, int * err = 0x00000039`a00fa3a0, char ** err_info = 0x00000039`a00fa3b0, int64 * data_offset = <Value unavailable error>)+0x37 [c:\buildbot\wireshark\wireshark-2.2-64\windows-2012r2-x64\build\wiretap\wtap.c @ 1237]
0b 00000039`a00fa350 00007ff7`c7b7cfd7 Wireshark!capture_info_new_packets(int to_read = 0n14, struct _info_data * cap_info = 0x000001a8`3f1838b8)+0x43 [c:\buildbot\wireshark\wireshark-2.2-64\windows-2012r2-x64\build\capture_info.c @ 211]
0c 00000039`a00fa3a0 00007ff7`c7b963dd Wireshark!capture_input_new_packets(struct _capture_session * cap_session = 0x000001a8`3f183878, int to_read = 0n23)+0xa7 [c:\buildbot\wireshark\wireshark-2.2-64\windows-2012r2-x64\build\ui\capture.c @ 407]
0d 00000039`a00fa3e0 00007ff7`c798889a Wireshark!sync_pipe_input_cb(int source = 0n4, void * user_data = 0x000001a8`3f183878)+0x1bd [c:\buildbot\wireshark\wireshark-2.2-64\windows-2012r2-x64\build\capchild\capture_sync.c @ 1775]
0e 00000039`a00fb470 00007ff7`c79d5144 Wireshark!MainWindow::pipeTimeout(void)+0x8a [c:\buildbot\wireshark\wireshark-2.2-64\windows-2012r2-x64\build\ui\qt\main_window_slots.cpp @ 949]
0f 00000039`a00fb4c0 00000000`5b94f906 Wireshark!MainWindow::qt_static_metacall(class QObject * _o = 0x000001a8`3f1836e0, QMetaObject::Call _c = <Value unavailable error>, int _id = <Value unavailable error>, void ** _a = 0x00000039`a00fb5f8)+0x4c4 [c:\buildbot\wireshark\wireshark-2.2-64\windows-2012r2-x64\build\cmbuild\ui\qt\moc_main_window.cpp @ 1436]
10 00000039`a00fb5c0 00000000`5b9c4d66 Qt5Core!QMetaObject::activate+0x5a6
11 00000039`a00fb6d0 00000000`5b95413a Qt5Core!QTimer::timeout+0x16
12 00000039`a00fb700 00000000`5bcf7d12 Qt5Core!QObject::event+0x6a
13 00000039`a00fb8b0 00000000`5bcf6c2f Qt5Widgets!QApplicationPrivate::notify_helper+0x112
14 00000039`a00fb8e0 00000000`5b92f689 Qt5Widgets!QApplication::notify+0x1b3f
15 00000039`a00fc000 00000000`5b977a8c Qt5Core!QCoreApplication::notifyInternal2+0xb9
16 00000039`a00fc080 00000000`5b976a32 Qt5Core!QEventDispatcherWin32Private::sendTimerEvent+0x10c
17 00000039`a00fc0e0 00007ffc`e6851c24 Qt5Core!QEventDispatcherWin32::processEvents+0xd82
18 00000039`a00fc1f0 00007ffc`e685156c USER32!UserCallWinProcCheckWow+0x274
19 00000039`a00fc350 00000000`5b9761d9 USER32!DispatchMessageWorker+0x1ac
1a 00000039`a00fc3d0 00007ffc`a14029b9 Qt5Core!QEventDispatcherWin32::processEvents+0x529
1b 00000039`a00ff760 00000000`5b92bf91 qwindows!qt_plugin_query_metadata+0x2499
1c 00000039`a00ff790 00000000`5b92e477 Qt5Core!QEventLoop::exec+0x1b1
1d 00000039`a00ff810 00007ff7`c7929ccd Qt5Core!QCoreApplication::exec+0x147
1e 00000039`a00ff880 00007ff7`c7ba2ac5 Wireshark!main(int argc = 0n1, char ** qt_argv = <Value unavailable error>)+0xe3d [c:\buildbot\wireshark\wireshark-2.2-64\windows-2012r2-x64\build\wireshark-qt.cpp @ 853]
1f 00000039`a00ffd50 00007ff7`c7ba22fd Wireshark!WinMain+0x155
20 00000039`a00ffde0 00007ffc`e87c8364 Wireshark!__tmainCRTStartup(void)+0x149 [f:\dd\vctools\crt\crtw32\dllstuff\crtexe.c @ 618]
21 00000039`a00ffe20 00007ffc`e8f470d1 KERNEL32!BaseThreadInitThunk+0x14
22 00000039`a00ffe50 00000000`00000000 ntdll!RtlUserThreadStart+0x21

 

The symbols seemed to be correct and functional.
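Symbol loading can also be verified explicitly by listing the modules of interest, for example:

lm vm Wireshark
lm vm libwiretap

The output should point to the downloaded PDB files rather than showing export-only symbols.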

 

Wanted to verify the location of the endless loop by setting breakpoints again.

 

Set a single breakpoint at:

Wireshark!capture_input_new_packets

With:

bc *
bp Wireshark!capture_input_new_packets

Result after continue: Breakpoint was not hit (within 1 minute)

 

Set a single breakpoint at:

Wireshark!capture_info_new_packets

With:

bc *
bp Wireshark!capture_info_new_packets

Result after continue: Breakpoint was not hit (within 1 minute)

 

Set a single breakpoint at:

libwiretap!wtap_read

With:

bc *
bp libwiretap!wtap_read

Result after continue: The breakpoint was continually hit.

 

This result indicated that the endless loop was in Wireshark!capture_info_new_packets(…) calling libwiretap!wtap_read(…), specifically here:

Wireshark!capture_info_new_packets(int to_read = 0n14, struct _info_data * cap_info = 0x000001a8`3f1838b8)+0x43 [c:\buildbot\wireshark\wireshark-2.2-64\windows-2012r2-x64\build\capture_info.c @ 211]

 

Downloaded the source code for Wireshark from:

https://1.eu.dl.wireshark.org/src/wireshark-2.2.3.tar.bz2

 

Looked for and opened: capture_info.c

Found the function, which is included below:

/* new packets arrived */
void capture_info_new_packets(int to_read, info_data_t* cap_info)
{
    int err;
    gchar *err_info;
    gint64 data_offset;
    struct wtap_pkthdr *phdr;
    union wtap_pseudo_header *pseudo_header;
    int wtap_linktype;
    const guchar *buf;

    cap_info->ui.new_packets = to_read;

    /*g_warning("new packets: %u", to_read);*/

    while (to_read > 0) {
        wtap_cleareof(cap_info->wtap);
        if (wtap_read(cap_info->wtap, &err, &err_info, &data_offset)) {
            phdr = wtap_phdr(cap_info->wtap);
            pseudo_header = &phdr->pseudo_header;
            wtap_linktype = phdr->pkt_encap;
            buf = wtap_buf_ptr(cap_info->wtap);

            capture_info_packet(cap_info, wtap_linktype, buf, phdr->caplen, pseudo_header);

            /*g_warning("new packet");*/
            to_read--;
        }
    }

    capture_info_ui_update(&cap_info->ui);
}

 

The points of interest are the while loop condition and the call to wtap_read.

 

The while loop continues as long as to_read is greater than 0.

Noticed that the to_read variable continued to have the value 14.

to_read is only decremented when wtap_read returns a non-zero (TRUE) value.

It seems that this never happens when the endless loop occurs in Wireshark.
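As a sketch only (not a proposed patch), the loop could stop when wtap_read(…) fails, assuming it returns FALSE with *err set to 0 at end of file and to a non-zero value on a real error:

while (to_read > 0) {
    wtap_cleareof(cap_info->wtap);
    if (!wtap_read(cap_info->wtap, &err, &err_info, &data_offset)) {
        /* No packet could be read (EOF or error) -- stop looping
           instead of retrying the same read forever. */
        break;
    }

    /* ... process the packet as before ... */
    to_read--;
}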

 

I have noticed that the problem apparently doesn’t occur with the GTK-based Wireshark Legacy (at least with default settings).

I wanted to determine the cause of the difference.

 

Examined symbols for Wireshark (QT) with:

x /D /f Wireshark!capture_info_new_packets

Result:

00007ff7`c7929f90 Wireshark!capture_info_new_packets (int, struct _info_data *)

 

Examined symbols for Wireshark Legacy (GTK) with:

x /D /f Wireshark_gtk!capture_info_new_packets

Result:

00007ff7`c03310f0 Wireshark_gtk!capture_info_new_packets (int, struct _info_data *)

 

Both versions include the function capture_info_new_packets(…).

 

Checked where the function was mentioned in the source code with grep (from Cygwin):

grep "capture_info_new_packets" -r

Result:

capture_info.c:void capture_info_new_packets(int to_read, info_data_t* cap_info)
capture_info.h:extern void capture_info_new_packets(int to_read, info_data_t* cap_info);
ui/capture.c:    capture_info_new_packets(to_read, cap_session->cap_data_info);

 

Checked the content of ui/capture.c and found:

if(capture_opts->show_info)
    capture_info_new_packets(to_read, cap_session->cap_data_info);

 

So the capture_info_new_packets(…) function will only be called if the show_info option is true…

 

Looked for places where show_info was used with:

grep "show_info" -r

Part of the result was:

ui/gtk/capture_dlg.c:  gtk_toggle_button_set_active(GTK_TOGGLE_BUTTON(hide_info_cb), !global_capture_opts.show_info);
ui/gtk/capture_dlg.c:  global_capture_opts.show_info =
ui/qt/capture_interfaces_dialog.cpp:    global_capture_opts.show_info = checked;
ui/qt/capture_interfaces_dialog.cpp:    ui->cbExtraCaptureInfo->setChecked(global_capture_opts.show_info);

 

For Wireshark Legacy (GTK) I checked the content of ui/gtk/capture_dlg.c and found:

global_capture_opts.show_info =
    !gtk_toggle_button_get_active(GTK_TOGGLE_BUTTON(hide_info_cb));

 

For Wireshark Legacy (GTK) the endless loop problem can be avoided by enabling this option in the Capture Options dialog:

Hide capture info dialog   (default setting)

 

For Wireshark (QT) I checked the content of ui/qt/capture_interfaces_dialog.cpp and found:

void CaptureInterfacesDialog::on_cbExtraCaptureInfo_toggled(bool checked)
{
    global_capture_opts.show_info = checked;
}

 

For Wireshark (QT) the endless loop problem can be avoided by disabling this option in the Capture Interfaces dialog:

Show extra capture information dialog   (not default setting)

 

Conclusion

The endless loop occurs consistently depending on the identified setting, so the cause has likely been found, along with a workaround.

Until the endless loop bug has been fixed in Wireshark, it is recommended to set the Wireshark options as described above.

Debugging endless loop caused by missing SignalR performance counters

I recently experienced a problem with an IIS website:

After changing and building code, the site would hang indefinitely at startup.

It seemed like an endless loop, because CPU activity was high.

 

One workaround was to reset IIS with:

iisreset

Then the site became functional until the next change + build.
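A lighter workaround might have been to recycle only the site's application pool instead of resetting all of IIS, for example (the pool name is just a placeholder):

%windir%\system32\inetsrv\appcmd recycle apppool /apppool.name:"MySitePool"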

 

However I decided to find the cause of the problem and a real solution.

 

I checked the process activity for w3wp.exe with Process Monitor:

These parts of the registry were continually accessed:

(Screenshot: Process Monitor showing the registry activity of w3wp.exe)

HKLM
HKLM\SYSTEM\CurrentControlSet\Services\SignalR\Performance
HKLM\System\CurrentControlSet\Services\SignalR\Performance

 

At this point I should have checked if the system actually had SignalR performance counters installed…
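As a sketch of that check, the category can be queried from PowerShell via .NET (the category name "SignalR" is an assumption here):

[System.Diagnostics.PerformanceCounterCategory]::Exists('SignalR')

This returns True when the SignalR performance counters are installed and False otherwise.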

 

However I decided to examine the situation further by attaching WinDbg (x64) to the w3wp.exe process.

 

Because I was suspecting an endless loop I decided to check the thread CPU activity with:

!runaway

Part of the result was:

User Mode Time
Thread       Time
6:26cc      0 days 0:05:39.750
15:2e00      0 days 0:00:11.187
13:27b0      0 days 0:00:06.718

 

Thread 6 was definitely the one with the most activity, so I switched to that with:

~6s

 

Then checked the .NET call stack with:

!clrstack

Part of the result was:

OS Thread Id: 0x26cc (6)
Child SP               IP Call Site
000000f1a0c6b158 00007ff98b60b1a6 [HelperMethodFrame: 000000f1a0c6b158] System.Security.SecurityRuntime.GetSecurityObjectForFrame(System.Threading.StackCrawlMark ByRef, Boolean)
000000f1a0c6b260 00007ff98a4b584c System.Security.SecurityRuntime.RevertAssert(System.Threading.StackCrawlMark ByRef)
000000f1a0c6b290 00007ff98a4b9e3d System.Security.PermissionSet.RevertAssert()
000000f1a0c6b2c0 00007ff989cd5e6d System.Diagnostics.PerformanceCounterLib.FindCustomCategory(System.String, System.Diagnostics.PerformanceCounterCategoryType ByRef)
000000f1a0c6b310 00007ff989cd5db8 System.Diagnostics.PerformanceCounterLib.FindCustomCategory(System.String, System.Diagnostics.PerformanceCounterCategoryType ByRef)
000000f1a0c6b380 00007ff989cd735c System.Diagnostics.PerformanceCounterLib.IsCustomCategory(System.String, System.String)
000000f1a0c6b3e0 00007ff989cdd1af System.Diagnostics.PerformanceCounter.InitializeImpl()
000000f1a0c6b450 00007ff989cdc35f System.Diagnostics.PerformanceCounter..ctor(System.String, System.String, System.String, Boolean)
000000f1a0c6b4a0 00007ff92f81b7b5 Microsoft.AspNet.SignalR.Infrastructure.PerformanceCounterManager.LoadCounter(System.String, System.String, System.String, Boolean)
000000f1a0c6b550 00007ff92f81b606 Microsoft.AspNet.SignalR.Infrastructure.PerformanceCounterManager.SetCounterProperties()
000000f1a0c6b5c0 00007ff92f81b2d7 Microsoft.AspNet.SignalR.Infrastructure.PerformanceCounterManager.Initialize(System.String, System.Threading.CancellationToken)
000000f1a0c6b640 00007ff92f81a23b Microsoft.AspNet.SignalR.Hosting.HostDependencyResolverExtensions.InitializeHost(Microsoft.AspNet.SignalR.IDependencyResolver, System.String, System.Threading.CancellationToken)
000000f1a0c6b680 00007ff92f81859b Owin.OwinExtensions.UseSignalRMiddleware[[System.__Canon, mscorlib]](Owin.IAppBuilder, System.Object[])
000000f1a0c6b770 00007ff92f818147 Owin.OwinExtensions+c__DisplayClass1.b__0(Owin.IAppBuilder)
000000f1a0c6b7b0 00007ff92f816c22 Owin.MapExtensions.Map(Owin.IAppBuilder, Microsoft.Owin.PathString, System.Action`1)
000000f1a0c6b800 00007ff92f816993 Owin.MapExtensions.Map(Owin.IAppBuilder, System.String, System.Action`1)
000000f1a0c6b850 00007ff92f816814 Owin.OwinExtensions.MapSignalR(Owin.IAppBuilder)

 

After continuing and breaking execution a few times I was seeing similar call stacks.

I needed to identify the endless loop.

 

Experimented by setting breakpoints from the above call stack.

 

Set a single breakpoint at:

00007ff989cd735c System.Diagnostics.PerformanceCounterLib.IsCustomCategory(System.String, System.String)

With:

bc *
bp 00007ff989cd735c

Result: Breakpoint was continually hit.

 

Set a single breakpoint at:

00007ff989cdd1af System.Diagnostics.PerformanceCounter.InitializeImpl()

With:

bc *
bp 00007ff989cdd1af

Result: Breakpoint was not hit within 1 minute.

 

This indicated that the endless loop was in:

System.Diagnostics.PerformanceCounterLib.IsCustomCategory(System.String, System.String)

 

I used ILSpy to decompile the method and noticed this loop:

for (CultureInfo cultureInfo = CultureInfo.CurrentCulture; cultureInfo != CultureInfo.InvariantCulture; cultureInfo = cultureInfo.Parent)
{
    performanceCounterLib = PerformanceCounterLib.GetPerformanceCounterLib(machine, cultureInfo);
    if (performanceCounterLib.IsCustomCategory(category))
    {
        return true;
    }
}

 

I noted that the loop condition likely continued to be true.

And that performanceCounterLib.IsCustomCategory(category) most likely continued to be false.

This was likely the endless loop I was looking for.

 

I set an additional breakpoint with:

!bpmd   System.dll   System.Diagnostics.PerformanceCounterLib.IsCustomCategory

Messages from WinDbg:

Found 2 methods in module 00007ff989421000...
MethodDesc = 00007ff989524020
MethodDesc = 00007ff989524040
Setting breakpoint: bp 00007FF989CD73EB [System.Diagnostics.PerformanceCounterLib.IsCustomCategory(System.String)]
Setting breakpoint: bp 00007FF989CD727A [System.Diagnostics.PerformanceCounterLib.IsCustomCategory(System.String, System.String)]
Adding pending breakpoints...

 

However, when continuing, only breakpoint 0 was hit, further indicating that the endless loop occurred inside PerformanceCounterLib.IsCustomCategory(…).

 

Decided to look for the CultureInfo object in the loop.

 

Checked the call stack including locals and parameters with:

!clrstack -a

Top part of the call stack was:

OS Thread Id: 0x26cc (6)
Child SP               IP Call Site
000000f1a0c6b380 00007ff989cd735c System.Diagnostics.PerformanceCounterLib.IsCustomCategory(System.String, System.String)
PARAMETERS:
machine (<CLR reg>) = 0x000000f4a155df18
category (<CLR reg>) = 0x000000f4a1584228
LOCALS:
<CLR reg> = 0x000000f1a158f130

 

Examined the local reference with:

!DumpObj /d 000000f1a158f130

Part of the result was:

Name:        System.Globalization.CultureInfo
MethodTable: 00007ff98a701420
EEClass:     00007ff98a111e48
Size:        128(0x80) bytes
File:        C:\Windows\Microsoft.Net\assembly\GAC_64\mscorlib\v4.0_4.0.0.0__b77a5c561934e089\mscorlib.dll
Fields:
MT    Field   Offset                 Type VT     Attr            Value Name
00007ff98a6ed6f8  4001208       74       System.Boolean  1 instance                1 m_isReadOnly
00007ff98a6e0698  4001209        8 ...ation.CompareInfo  0 instance 000000f1a1593b40 compareInfo
...
...
00007ff98a6ed6f8  4001212       75       System.Boolean  1 instance                0 m_isInherited
00007ff98a6ed6f8  4001213       76       System.Boolean  1 instance                1 m_isSafeCrossDomain
00007ff98a7003d0  4001214       70         System.Int32  1 instance                2 m_createdDomainID
00007ff98a701420  4001215       40 ...ation.CultureInfo  0 instance 0000000000000000 m_consoleFallbackCulture
00007ff98a6fda88  4001216       48        System.String  0 instance 000000f1a1551420 m_name
00007ff98a6fda88  4001217       50        System.String  0 instance 000000f1a1551420 m_nonSortName
00007ff98a6fda88  4001218       58        System.String  0 instance 000000f1a1551420 m_sortName
00007ff98a701420  4001223       60 ...ation.CultureInfo  0 instance 000000f1a158f130 m_parent
...
...

 

Noticed that m_parent pointed to itself, suggesting that walking cultureInfo.Parent would never reach the CultureInfo.InvariantCulture instance, so the loop condition stayed true.

 

Cleared all breakpoints with:

bc *

Set another breakpoint with:

!bpmd   System.dll   System.Diagnostics.PerformanceCounterLib.FindCustomCategory

 

This breakpoint was continually hit as expected.

 

Then checked the PerformanceCounterLib.FindCustomCategory(…) method with ILSpy.

Noticed that the method accessed the registry and that the return value depended on what was found.

 

Searched online for: HKLM\System\CurrentControlSet\Services\SignalR\Performance

And found this article: http://www.asp.net/signalr/overview/performance/signalr-performance

 

Based on the article I decided to install SignalR performance counters by:

1. Installing the NuGet package: signalr.utils

2. Then installing the SignalR performance counters with:

SignalR.exe ipc

 

This was the solution!

After continuing execution, the loop ended and site initialization continued normally.

(The issue occurred with version 2.2.1 of SignalR and version 4.6.1 of .NET, but targeting version 4.5)

Conclusion

A site using SignalR can hang in an endless loop during initialization if the SignalR performance counters are not installed.

A combination of tools like Process Monitor, ILSpy and WinDbg can be used to discover the cause of an endless loop in third-party or framework code.

Get .NET call stack when files are written with WinDbg

I was interested in learning how Neuron ESB performed logging internally.

(To see if it would be relevant to use the same classes and methods for logging in custom C# process steps and possibly other situations)

 

At first I used Process Monitor.

I found a WriteFile event where a log file was updated.

(Screenshot: the WriteFile event in Process Monitor)

I checked the stack under event properties.

(Screenshot: the Stack tab in the event properties)

The native stack trace didn’t tell me which .NET classes or methods were in use, but it did give me an idea:

Perhaps I could use WinDbg to get the .NET call stack for each call to WriteFile in kernel32.dll.

 

Indeed this was possible by:

1. Starting WinDbg (x64) as Administrator.

2. Attaching to the ESBService.exe process.

3. Loading the SOS extension with:

.loadby sos clr

4. Setting a breakpoint with a command to show the .NET call stack and continue execution with:

bp kernel32!WriteFile "!clrstack; g"

5. Continuing execution with:

g

 

This showed the .NET call stack every time files were written.
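Since this produces a lot of output, the whole session can be logged to a text file by running .logopen before continuing execution (the path is just an example):

.logopen C:\Temp\ESBService-WriteFile-stacks.txt

and closed again with .logclose when done.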

Most WriteFile calls were for logging (as expected).

An example:

OS Thread Id: 0x2354 (35)
Child SP               IP Call Site
000000001836e028 00000000775a1fa0 [InlinedCallFrame: 000000001836e028] Microsoft.Win32.Win32Native.WriteFile(Microsoft.Win32.SafeHandles.SafeFileHandle, Byte*, Int32, Int32 ByRef, IntPtr)
000000001836e028 000007feea284cf4 [InlinedCallFrame: 000000001836e028] Microsoft.Win32.Win32Native.WriteFile(Microsoft.Win32.SafeHandles.SafeFileHandle, Byte*, Int32, Int32 ByRef, IntPtr)
000000001836dff0 000007feea284cf4 DomainNeutralILStubClass.IL_STUB_PInvoke(Microsoft.Win32.SafeHandles.SafeFileHandle, Byte*, Int32, Int32 ByRef, IntPtr)
000000001836e0d0 000007feea265984 System.IO.FileStream.WriteFileNative(Microsoft.Win32.SafeHandles.SafeFileHandle, Byte[], Int32, Int32, System.Threading.NativeOverlapped*, Int32 ByRef)
000000001836e130 000007feea2658d2 System.IO.FileStream.WriteCore(Byte[], Int32, Int32)
000000001836e1a0 000007feea265847 System.IO.FileStream.FlushInternalBuffer()
000000001836e1e0 000007feea2657d4 System.IO.FileStream.Flush(Boolean)
000000001836e220 000007feea27128c System.IO.StreamWriter.Flush(Boolean, Boolean)
000000001836e280 000007fec064f2f7 Microsoft.VisualBasic.Logging.FileLogTraceListener+ReferencedStream.Flush()
000000001836e2d0 000007fec064e065 Microsoft.VisualBasic.Logging.FileLogTraceListener.Flush()
000000001836e300 000007fee92e0859 System.Diagnostics.TraceInternal.WriteLine(System.String)
000000001836e370 000007fe8bcc780d Neuron.Esb.ESBMasterService.Trace(System.Diagnostics.TraceListener, System.String)
000000001836e3c0 000007fe8bcc7bec Neuron.Esb.ESBMasterService.TraceInfo(System.Diagnostics.TraceSwitch, System.String, System.Diagnostics.TraceListener)
000000001836e420 000007fe8bfa72a9 Neuron.Esb.ESBMasterService.TraceInfo(System.Diagnostics.TraceSwitch, System.String)
000000001836e460 000007fe8c1c831e Neuron.Esb.EsbService.AmqpService.Trace(System.String, System.Object[])
000000001836e4a0 000007fe8cbbbc2e Neuron.Esb.EsbService.AmqpService.UpdateServiceStatus(Neuron.Esb.EsbService.ServiceState, Newtonsoft.Json.Linq.JToken)
000000001836e530 000007fe8c8baa0b Neuron.Esb.EsbService.AmqpService.UpdateExchangeStatus(System.Object, System.Timers.ElapsedEventArgs)
000000001836e5d0 000007fee9802586 System.Timers.Timer.MyTimerCallback(System.Object)
000000001836e650 000007feea1c2490 System.Threading.ExecutionContext.RunInternal(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object, Boolean)
000000001836e720 000007feea1c2327 System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object, Boolean)
000000001836e750 000007feea1f0b9b System.Threading.TimerQueueTimer.CallCallback()
000000001836e7b0 000007feea1f0998 System.Threading.TimerQueueTimer.Fire()
000000001836e800 000007feea281a9f System.Threading.TimerQueue.FireNextTimers()
000000001836ed08 000007feeb303753 [DebuggerU2MCatchHandlerFrame: 000000001836ed08]
000000001836ee98 000007feeb303753 [ContextTransitionFrame: 000000001836ee98]
000000001836f0b8 000007feeb303753 [DebuggerU2MCatchHandlerFrame: 000000001836f0b8]

 

Having years of experience with log4net, I decided to use and recommend that instead.

Avoid storing multiple versions of custom adapters in Neuron ESB search path

While testing a custom adapter for Neuron ESB I encountered this error:

Neuron_could_not_load_type

System.TypeLoadException: Could not load type 'Neuron.Esb.Adapters.AdapterEncodingTypeConverter' from assembly 'Neuron.Esb.Adapters.FlatFileAdapter, Version=1.0.0.0, Culture=neutral, PublicKeyToken=null'.

This seemed pretty odd because the namespace and type name were correct, at least in the DLL stored under:

C:\Program Files\Neudesic\Neuron ESB v3\DEFAULT\Adapters\

Troubleshooting

Using Process Monitor I discovered that I had mistakenly copied an earlier version of Neuron.Esb.Adapters.FlatFileAdapter.dll to:

C:\Program Files\Neudesic\Neuron ESB v3\DEFAULT\

(Screenshot: Process Monitor showing the wrong adapter version being accessed)

(This version actually didn’t have the type/class needed, but Neuron Explorer seemed to prefer using it…)
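A quick way to spot stray copies like this is to search the Neuron ESB folder for the adapter DLL and compare file versions, e.g. (a sketch using the paths from this setup):

Get-ChildItem 'C:\Program Files\Neudesic\Neuron ESB v3\DEFAULT' -Recurse -Filter 'Neuron.Esb.Adapters.FlatFileAdapter.dll' |
    Select-Object FullName, LastWriteTime, @{ n = 'FileVersion'; e = { $_.VersionInfo.FileVersion } }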

The problem was resolved by:

1. Shutting down NeuronExplorer.exe and ESBService.exe (because they lock assemblies in use while running)

2. Removing the old and unwanted DLL.

3. Starting ESBService.exe and NeuronExplorer.exe

Conclusion

Do not store multiple versions of custom adapters in Neuron ESB’s search path; it can cause type-loading problems like this one.