Don't get me wrong. I think the world of BizTalk as a product. But, like any complex piece of software, there are one or two things that annoy me. I blogged a few weeks ago about the xpath() function, and the way BizTalk decides how to initialise an internal XmlSerializer instance based on the type of the variable to which you assign the xpath() result. If you type your variable inappropriately, your code breaks at run time with a problem that is difficult to diagnose.
Another orchestration issue which caused me some suffering a few days ago is the auto-construction of orchestration variables. If you create an orchestration variable of a type that has a default constructor, BizTalk will, by default, auto-construct your variable using that constructor. I've never really understood the logic of this design. First, it seems a little inconsistent. Some types don't have a default constructor, and hence cannot be auto-constructed by BizTalk. Second, the fact that auto-construction, where applicable, is set as the default behaviour is just downright dangerous.
Here is a real-world scenario. As always, you are up against a tight deadline to complete your work. Everything is going well, but you have been asked to add a small and simple piece of functionality to a particular orchestration which, hitherto, has been working properly. You drop a loop shape into the orchestration and set it up to enumerate some values provided by a custom class. Disaster. When you run the orchestration, you observe strange behaviour. An instance of the orchestration is activated, but then…nothing. The very first orchestration shape is an expression whose first line is a trace. This line never gets executed. Worse than that, the BizTalk process has all but locked up. The CPU is maxed out at 100% and it takes a couple of minutes of patient waiting to coax the host instance into stopping.
You then find it virtually impossible to get rid of the running orchestration instance. In fact, there are two service instances involved. The orchestration’s parent (it is a 'called' orchestration) receives a message via a published web service, and you have to stop the isolated host's IIS app pool in order to terminate the EPM (messaging) service instance associated with the isolated host. If you also try to suspend or terminate the orchestration service instance, the action is marked as pending. If you restart the host instance in order to complete the pending action, the process just locks up again. Because you can't get rid of the active orchestration service instance, you can't redeploy a new version of your BizTalk code.
Progress slows to a crawl. In desperation you eventually resort to manually truncating various BizTalk message box tables in order to recover from the lock-up (not a recommended strategy, though very effective). You can't trace or debug your orchestration code because none of it appears to ever run. You remove the loop that caused the problem and redeploy, and still your code doesn't work. Hours start to pass while your blood pressure slowly rises, and your deadlines come and go. You try everything you can think of, including recreating your orchestration from scratch, testing it carefully at each stage. Everything is fine until you recreate that darned loop. Again, BizTalk locks up.
The best thing to do in these situations is to walk away from the problem for a while. With a little encouragement from the project manager, this is what I did. The functionality represented by the loop was important, but I could carry on making progress without it for a while. Progress is duly made, and then, a few hours later, walking down the street after the day's work is done, my mind turns back naturally to the problem…and bingo (as we say here in the UK)…a moment of clarity and enlightenment. I know in an instant what is wrong. Next morning, it takes me about a minute to locate and fix the bug.
My loop uses a custom class. I created a variable at the outer scope of the orchestration to hold an instance of this class. The default constructor invokes some code that contains a loop. I had managed to write that loop so that it never hit the 'break' statement. So obvious. So basic. As soon as BizTalk instantiated my orchestration, it ran the code to auto-construct my custom class. The construction code hit the never-ending loop and everything locked up.
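To make the failure concrete, here is a minimal C# sketch of the kind of constructor involved. It is not the original code: the class names, the step of 2 and the target value of 5 are all invented for illustration, and the real loop was doing something more useful than counting.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the custom class. All names and values here are
// illustrative assumptions, not the original code.
public class BuggyValueSource
{
    public int Last { get; private set; }

    // A default constructor that does real work. Anything that constructs
    // the object implicitly, as auto-construction does, runs this loop.
    public BuggyValueSource()
    {
        int next = 0;
        while (true)
        {
            next += 2;
            // Bug: next only ever holds even values, so it never equals 5
            // and the break below is never reached.
            if (next == 5)
            {
                break;
            }
        }
        Last = next;
    }
}

// The same idea with an exit condition that can actually be met.
public class FixedValueSource
{
    private readonly List<int> values = new List<int>();

    public IReadOnlyList<int> Values { get { return values; } }

    public FixedValueSource()
    {
        int next = 0;
        while (true)
        {
            values.Add(next);
            next += 2;
            // Comparing with >= means that stepping past the limit still exits.
            if (next >= 5)
            {
                break;
            }
        }
    }
}
```

In a console application, `new FixedValueSource().Values` holds 0, 2 and 4, whereas `new BuggyValueSource()` simply hangs on the line that calls it, which is exactly where a debugger would point you. Under auto-construction there is no such line in your own code to break on.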
In future, I intend to adopt a new BizTalk coding standard. I shall rigorously switch off auto-construction for all my orchestration variables, and include explicit code in expression shapes to construct objects. This kind of bug takes a minute or so to diagnose when using a language like C#. In BizTalk orchestrations, it becomes opaque and very difficult to find. Worse than that, BizTalk does not provide sufficient facilities for easy recovery from the worst effects of auto-construction. This is a feature too far.