I worked through a problem recently with a client that really took me by surprise
– because I would think that many BizTalk shops would be running into this issue regularly. 
So!  Here goes with an explanation and a solution. 

We ran into this problem initially using the MSMQ (not MSMQT) adapter with BizTalk
2004.  We had roughly 10 MSMQ receive locations, as well as a few send ports
that were using the loopback
adapter.  These were all executing in the same host.

The initial symptom was that the loopback adapter appeared to not work – messages
were just not getting through!  They sat in the “delivered, not consumed” state
for no good reason.  But we quickly reproduced the problem with just MSMQ receive
locations (i.e. without the loopback adapter.)

On a single processor virtual machine, the repro looked like this:  Create four
MSMQ receive locations, and one MSMQ send port (with the send port subscribed
to one of the receive ports, just to keep things easy.)  No messages
will flow through the send port at all.
 

To repro: This
download has a binding file with receive ports/locations for local (non-tx) private
queues Q1-Q4, plus a send port for local private queue NONTXQ with a filter for the
first receive port.  There is also a bit of VBScript to put a message into a
queue… If you turn off one of the receive locations for Q2-Q4, you’ll find things
work just fine.  If you don’t, then (reiterating) no messages will flow through
the send port.

What was the resolution?  Well, with BizTalk 2004 Service Pack 1 installed,
you can create a “CLR Hosting” key under the registry service definition.

  • Open the registry key HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services

  • Under BTSSvc{some GUID}, create a key called ‘CLR Hosting’. (Note that there
    will be a BTSSvc3.0 entry present…but you must add the key under a BTSSvc{some
    GUID} key, where the GUID corresponds to the host you are dealing with, as shown by
    the DisplayName value.)  Example: BTSSvc{DDEF238B-2D21-4B1E-8845-C6C67C6A86C0}.

  • Under the “CLR Hosting” key (which you will create), create the following DWORD entries
    with the following values:

    • MaxIOThreads – 75 (actual # tbd)

    • MaxWorkerThreads – 75 (actual # tbd)

    • MinIOThreads – 55 (actual # tbd)

    • MinWorkerThreads – 55 (actual # tbd)
  • Restart the BizTalk host service.
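
If you have several hosts to touch, it can be handy to generate a .reg file rather than click through regedit each time.  Here is a minimal sketch in Python that only builds the text of such a file.  The function name clr_hosting_reg is hypothetical, the GUID is the example one from above, and the 75/55 defaults are the placeholder values from the list, not tested recommendations.

```python
def clr_hosting_reg(service_key, max_io=75, max_worker=75, min_io=55, min_worker=55):
    """Return .reg file text that creates the 'CLR Hosting' key and its four DWORDs.

    service_key is the BTSSvc{GUID} service name for the host in question.
    The default numbers are the placeholders from the post (actual values tbd).
    """
    path = ("HKEY_LOCAL_MACHINE\\SYSTEM\\CurrentControlSet\\Services\\"
            + service_key + "\\CLR Hosting")
    values = [
        ("MaxIOThreads", max_io),
        ("MaxWorkerThreads", max_worker),
        ("MinIOThreads", min_io),
        ("MinWorkerThreads", min_worker),
    ]
    lines = ["Windows Registry Editor Version 5.00", "", "[" + path + "]"]
    for name, number in values:
        # .reg files store DWORD values as eight hexadecimal digits.
        lines.append('"{}"=dword:{:08x}'.format(name, number))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(clr_hosting_reg("BTSSvc{DDEF238B-2D21-4B1E-8845-C6C67C6A86C0}"))
```

Review the generated text before importing it, and remember that the host service still needs a restart afterwards.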

In our case, we actually had to increase these values – you should determine the values
you need through testing.  Consider having min worker threads equal to 7x the
number of MSMQ receive locations, and max worker threads equal to 10x the
number of MSMQ receive locations.  (More on these numbers later…)
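
To make that rule of thumb concrete, here is a tiny Python sketch of the arithmetic.  The function name is hypothetical, and the 7x and 10x multipliers are simply the starting points suggested above, to be confirmed by your own load testing.

```python
def suggested_worker_threads(msmq_receive_locations):
    """Starting-point thread settings for a host, per the 7x / 10x rule of thumb."""
    if msmq_receive_locations < 0:
        raise ValueError("number of receive locations cannot be negative")
    return {
        # 7x: a floor proportional to the number of MSMQ receive locations.
        "MinWorkerThreads": 7 * msmq_receive_locations,
        # 10x: headroom for everything else sharing the thread pool.
        "MaxWorkerThreads": 10 * msmq_receive_locations,
    }


if __name__ == "__main__":
    print(suggested_worker_threads(10))
```

For the roughly 10 receive locations in our original scenario, this works out to a minimum of 70 and a maximum of 100 worker threads.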

Does the documentation address this?  Good question.  If you look at
the topic “Managing Multiple Receive Locations” in the MSMQ adapter documentation, you
will find some reference to this.  It indicates you should create a “CLR Hosting”
key as described above…but no actual values are mentioned (clearly just
a documentation mishap.)

But why do these have to be tweaked at all?  Good question.  The
documentation for the MSMQ adapter has some unfortunate quotables, like:

To increase performance, Microsoft BizTalk® 2004 Adapter for MSMQ
is multi-threaded. If you have many receive locations, there may not be enough threads
available for all the receive locations. This prevents some of the receive locations
from picking up messages.

The reality is that you really shouldn’t have to starve any particular receive location
because of a lack of threads…you should just wind up with increased latency. 
But, such is not the implementation of the MSMQ adapter (at least for BizTalk
2004.) 

Some background: The MSMQ adapter has a “Batch Size” parameter and a “Serial
Processing” parameter that can be set per receive location.  “Batch Size” determines
how many messages the adapter will attempt to read from the queue (and submit to the
message box) on each iteration.  “Serial Processing” determines whether one thread
is engaged in the peek/get/submit activity per receive location (Serial Processing
= ‘true’) or multiple threads (Serial Processing = ‘false’).  If “Serial
Processing” is true, the “Batch Size” is forced to one regardless of the actual setting.

So what is the execution flow for a given receive location?  The internal class
MsmqReceiverEndpoint is instantiated per receive location, and when it initializes,
it calls ThreadPool.QueueUserWorkItem with
a reference to itself.   If “Serial Processing” is false…it does this
exactly seven (7) times.  
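
Putting those two settings together, a rough Python model of what each receive location asks of the thread pool might look like the following.  All of the names are hypothetical, and the "one work item when Serial Processing is true" figure is my reading of the behavior described above rather than anything documented.

```python
def work_items_queued(serial_processing):
    """Work items one receive location queues on the thread pool at startup."""
    # Assumption from the text: one when serial, exactly seven otherwise.
    return 1 if serial_processing else 7


def effective_batch_size(serial_processing, configured_batch_size):
    """Batch size actually used; serial processing forces it to one."""
    return 1 if serial_processing else configured_batch_size


def total_work_items(receive_locations):
    """Sum over (serial_processing, batch_size) pairs for every location in a host."""
    return sum(work_items_queued(serial) for serial, _batch in receive_locations)


if __name__ == "__main__":
    # Three non-serial locations and one serial location in the same host.
    host = [(False, 20), (False, 20), (False, 20), (True, 20)]
    print(total_work_items(host))
    print([effective_batch_size(serial, batch) for serial, batch in host])
```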

What does it do with the QueueUserWorkItem callback?  Well, when MsmqReceiverEndpoint.ProcessWorkItem
is called, it enters into a do/while loop that doesn’t exit until the
endpoint (receive location) becomes invalid (i.e. the receive location is shut down.) 
In other words, ProcessWorkItem sits on a .NET thread pool thread – and if Serial
Processing is false, it sits on seven of them.  The do/while loop executes
a peek on the queue (with a hard-coded 10 second timeout), and if there are messages
waiting, it receives up to “Batch Size” and submits them to the message box.  (It
will give up attempting to receive a “Batch Size” worth of messages if the 10 second
timeout is reached on any attempt within the batch receive loop – i.e. if you drop
a single message on a queue, and the batch size is greater than one, expect to wait
10 seconds before further activity begins…)  The behavior of consuming seven
threads per queue leads to the recommendation of MinWorkerThreads = 7x MSMQ receive
locations provided above.
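
The starvation is easy to model outside of BizTalk.  The Python sketch below uses a fixed-size ThreadPoolExecutor as a stand-in for the .NET thread pool, and it assumes a limit of 25 worker threads, which as far as I know was the .NET 1.1 default per processor (that figure is not taken from the adapter documentation).  Each simulated receive location queues seven loops that never return, and then one more work item, standing in for the send port, is queued behind them.

```python
import threading
from concurrent.futures import ThreadPoolExecutor, TimeoutError


def send_item_runs(receive_locations, pool_size=25, loops_per_location=7,
                   wait_seconds=0.5):
    """Return True if a work item queued after the receive loops gets to run."""
    stop = threading.Event()
    pool = ThreadPoolExecutor(max_workers=pool_size)

    def receive_loop():
        # Stand-in for ProcessWorkItem: holds its pool thread until shutdown.
        while not stop.is_set():
            stop.wait(0.05)

    try:
        for _ in range(receive_locations * loops_per_location):
            pool.submit(receive_loop)
        # Stand-in for any other work that needs a pool thread (e.g. a send).
        send = pool.submit(lambda: "sent")
        try:
            send.result(timeout=wait_seconds)
            return True
        except TimeoutError:
            return False
    finally:
        # Let the loops exit so the pool can be torn down cleanly.
        stop.set()
        pool.shutdown(wait=True)


if __name__ == "__main__":
    for locations in (3, 4):
        print(locations, "receive locations, send ran:", send_item_runs(locations))
```

With three locations, 21 threads are held and the extra work item finds a free one.  With four, 28 loops compete for 25 threads, so the extra work item waits in the queue behind loops that never finish, which lines up with the single-processor repro described earlier.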

Now, I confess – I’m not a BizTalk adapter expert.  But, this design
seems to be in conflict with the advice offered in “Writing Effective BizTalk Server
Adapters”, where it says:

Don’t starve the .NET thread pool: …While starving
the .NET thread pool is a risk to all asynchronous programming in .NET, it is particularly
important for the BizTalk Server adapter programmer to watch out for this.  It
has impacted many BizTalk Server adapters: take great care not to starve the .NET
thread pool.  The .NET thread pool is a limited but widely shared resource.  It
is very easy to write code that uses one of its threads and holds onto it for ages
and in so doing blocks other work items from ever being executed….If you have multiple
pieces of work to do (for example copying messages out of MQSeries into BizTalk Server),
you should execute one work item (one batch of messages into BizTalk Server) and simply
requeue in the thread pool if there is more work to do. What ever you do, don’t
sit in a while loop on the thread.
  

Is this fixed in BizTalk 2006?  Surely it is…  And, in fact,
it sure seems to be in Beta 1.  The design of the adapter is a bit different… First, “Serial
Processing” refers to whether additional messages will be received from
the queue prior to the “EndBatchComplete” event being set (downstream
of IBTDTCCommitConfirm.Done.)  (This part of “Serial Processing” is
true for BizTalk 2004 as well, along with forcing the batch size to one.)  “Serial
Processing” in BizTalk 2006 does not affect how many threads will be reading
from your queue – you will have just one (despite what the Beta 1 docs say…),
unless you have multiple host instances in play.  (That one thread using
a large batch size and operating with serial processing set to ‘false’ – not
blocking on the actual message box submission – should keep up with a fairly
large message arrival rate, but multiple host instances might be needed for your particular
case.)

More importantly, the ProcessWorkItem implementation returns immediately after a single
peek/get/submit operation (and simply calls QueueUserWorkItem again, per the advice
cited above.)  (Side note: There seems to be some room in the design for the
idea that you in fact wouldn’t return immediately if more than a threshold number
of messages were received, but currently this condition is “if # of messages received
> BatchSize”, which won’t ever happen.) 
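
For contrast with the do/while approach, here is a minimal Python sketch of the "do one unit of work, then requeue" pattern.  It is an illustration only, not the adapter's code: the names are hypothetical, plain in-memory queues stand in for MSMQ, and the sketch stops an endpoint once its queue is empty, where a real adapter would keep polling.

```python
import queue
import threading
from concurrent.futures import ThreadPoolExecutor


def drain_with_requeue(queues, pool_size=2, batch_size=2):
    """Drain every queue using fewer pool threads than there are endpoints."""
    received = {name: [] for name in queues}
    done = {name: threading.Event() for name in queues}
    pool = ThreadPoolExecutor(max_workers=pool_size)

    def process_work_item(name):
        # One peek/get/submit pass: take at most batch_size messages.
        batch = []
        while len(batch) < batch_size:
            try:
                batch.append(queues[name].get_nowait())
            except queue.Empty:
                break
        received[name].extend(batch)
        if batch:
            # More may be waiting: requeue and give the thread back to the pool.
            pool.submit(process_work_item, name)
        else:
            # Demo only: an empty queue ends this endpoint's chain of work items.
            done[name].set()

    for name in queues:
        pool.submit(process_work_item, name)
    for event in done.values():
        event.wait()
    pool.shutdown(wait=True)
    return received


if __name__ == "__main__":
    demo = {}
    for name in ("Q1", "Q2", "Q3", "Q4"):
        demo[name] = queue.Queue()
        for i in range(5):
            demo[name].put(name + "-msg" + str(i))
    print(drain_with_requeue(demo))
```

Because no work item holds its thread for longer than one batch, four endpoints make progress on a two-thread pool, with added latency rather than starvation.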

So what should I do for now with BizTalk 2004?  For those using the
MSMQ Adapter with BizTalk 2004…consider whether you can set “Serial Processing”
equal to true.  Keep in mind this forces you to a batch size of 1, so this
might not work depending on your message arrival rate.  If you test this configuration
and find an unacceptable performance loss, consider setting the MinWorkerThreads value
to 7x the number of MSMQ receive locations you are maintaining, and
MaxWorkerThreads to roughly 10x (to provide breathing room.) 
As an alternative, spread your receive locations among multiple hosts (though avoid
an over-proliferation of hosts – that has its own issues.) 

And never draw any conclusions until you have performance tested at load with your
final host configuration – that is, your final allocation of send handlers, receive
handlers, send ports, and receive locations among your hosts!  Other adapters
may affect the outcome if they involve polling on the receive side, or polling on
the “response” side of a solicit-response send port.  (If they use a thread pool
thread to do their work, they can be affected by any adapter that consumes threads
whether they themselves are written correctly or not!)  Finally, I’ve heard from
a gentleman who has done extensive testing that the threading parameters above
are useful/necessary when using large numbers of MSMQT receive locations as well.

Never a dull day in BizTalk land…!