Despite all the functional testing and stress testing you do prior to releasing your
BizTalk app into production, unexpected behavior can (and often will) happen just
the same. Production usage winds up introducing all sorts of permutations
(including interactions with external systems) that are hard to predict earlier in
the lifecycle.
The goal, of course, is to minimize the operational “care and feeding” that an
application requires over time. Making this happen is mostly a function of using
the application’s “diagnostic surface area” (logs, counters, MOM packs, etc.) to feed
back into each release cycle. But we also need post-mortem tools when the host
environment terminates unexpectedly or stops responding (whether that environment
is BizTalk, IIS, COM+, SQL Server SSIS, etc.)
While a well-designed app will be able to successfully restart and resume processing
(with full data integrity) at such a point (i.e. after the host has been terminated),
there is still an operational expense that has been incurred. We want to find and
eliminate these problems…
Using the Visual Studio debugger is almost never an option in production, of course.
We need the ability to capture the current state as a “dump file” and do offline analysis.
The “Windows Debugging Tools” are designed for this purpose (and you will often use
these during a call with Microsoft’s support staff, so it is good to be familiar with them.)
The debugging tools are a pretty large subject – so here, we are just going to cover
the bare minimum required to capture a dump file for your running BizTalk process
when it appears to be hung with a large number of “Active” service instances.
Step By Step:
- Install or xcopy the Windows Debugging Tools to the server where BizTalk is currently
  hung (or crashing unexpectedly). It can be helpful to install to a location that is
  easy to reach from the command line, like ‘c:\debuggers’.
- From the command line, run the following to get the process IDs for all BizTalk hosts:
  typeperf "\BizTalk:Messaging(*)\ID Process" -sc 1
- Run ‘adplus.vbs’ in crash or hang mode, depending on whether the process ends unexpectedly
  (crash) or has become unresponsive (hang). To generate a hang dump, your command
  line might look like:
  c:\debuggers> cscript adplus.vbs -hang -CTCF -p (pid from last step) -o c:\temp
- Copy the dump file to an offline location if need be.
- Set an environment variable called ‘_NT_SYMBOL_PATH’ to
  ‘srv*c:\symbols*http://msdl.microsoft.com/download/symbols’.
  Alternatively, launch WinDbg.exe from the debuggers directory and use the File-‘Symbol
  File Path’ menu. This will ensure that you are automatically downloading the
  correct symbols when you analyze the crash dump.
- Start WinDbg.exe, and use File-‘Open Crash Dump’ to open your dump file. Then,
  in the command window, use:
  .load C:\WINDOWS\Microsoft.NET\Framework\v2.0.50727\sos.dll
  to load the managed code debugging extensions.
- In the command window, use !EEStack to get a full stack trace. Use Edit-Find
  to search for your custom code method name or the name of your orchestration.
  Look for patterns that indicate the cause of the hang (“hmmm, all my threads seem
  to be inside Thread.Sleep. That’s funny.”) Use !help from the command window
  to begin learning about the rest of SOS (to assist with diagnosing managed memory
  leaks, etc.)
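If you have several BizTalk hosts on one box, picking the right process ID out of the typeperf output by eye gets tedious. Here is a rough Python sketch that maps each host instance to its PID, so you know which one to hand to adplus. It assumes typeperf writes its usual CSV layout (a header row starting with "(PDH-CSV", followed by one sample row); the function name, server name, host names and PIDs below are all made up for illustration, so check them against what your own server prints.

```python
import csv
import io
import re

# Pulls the instance (host) name out of a counter path such as
# \\SERVER\BizTalk:Messaging(HostName)\ID Process
_INSTANCE = re.compile(
    r"\\BizTalk:Messaging\((?P<host>[^)]*)\)\\ID Process$", re.IGNORECASE
)


def parse_host_pids(typeperf_output: str) -> dict[str, int]:
    """Map BizTalk host instance name -> process ID from typeperf text.

    Assumption: the text is the console output of
    typeperf "\\BizTalk:Messaging(*)\\ID Process" -sc 1
    """
    rows = [row for row in csv.reader(io.StringIO(typeperf_output)) if row]

    # Find the CSV header row; the sample row is expected right after it.
    header_index = next(
        (i for i, row in enumerate(rows) if row[0].startswith("(PDH-CSV")),
        None,
    )
    if header_index is None or header_index + 1 >= len(rows):
        return {}

    header, sample = rows[header_index], rows[header_index + 1]
    pids = {}
    # Column 0 is the timestamp, so skip it in both rows.
    for column, value in zip(header[1:], sample[1:]):
        match = _INSTANCE.search(column)
        if match and value.strip():
            # typeperf reports the PID as a float string, e.g. "3412.000000"
            pids[match.group("host")] = int(float(value))
    return pids


# Illustrative output only: server, hosts and PIDs are invented.
SAMPLE_OUTPUT = r'''
"(PDH-CSV 4.0)","\\SERVER01\BizTalk:Messaging(BizTalkServerApplication)\ID Process","\\SERVER01\BizTalk:Messaging(OrderProcessingHost)\ID Process"
"03/01/2007 10:15:02.531","3412.000000","5120.000000"
Exiting, please wait...
The command completed successfully.
'''

if __name__ == "__main__":
    for host, pid in parse_host_pids(SAMPLE_OUTPUT).items():
        print(f"{host}: {pid}")
```

In practice you would redirect the typeperf output to a file (or capture it from a script) and pass the text to parse_host_pids, then use the PID for the hung host in the adplus command line.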
For more information on the Windows debugging tools, see here.