BizTalk XmlDisassembler Problem with <?xml?>

If you’re working with BizTalk, then you owe it to yourself to go read Tomas
Restrepo’s latest post.  In short, he has located a bug in how the XmlDisassembler
handles the <?xml?> declaration when determining the encoding.

If the XmlDisassembler pipeline component encounters <?xml version='1.0'
encoding='UTF-8'?> it will throw an exception whose message has nothing to do with
the real problem.  It has no problem with <?xml version='1.0' encoding="UTF-8"?>
though.  Go read Tomas’ article for the full details.  Great job Tomas in
chasing this down!

BizTalk and Single-Quoted Attributes in XML

The XML Specification allows attribute values to be wrapped in either double quotes
(") or single quotes ('). Normally, BizTalk Server, as a good standards-abiding citizen,
has no problem with this. However, I had been seeing messages on the BizTalk newsgroup
claiming that BizTalk would refuse to process XML documents using single quotes when
the XML Disassembler component tried to parse the incoming message.

I spent a few minutes researching this claim and found that there is some truth
to it. There does appear to be a bug in the XML Disassembler when it runs into
documents using single quotes, but it only affects some documents.

The first thing the XML Disassembler component does when it receives a message is
probe it to see if it is indeed an XML message and try to “guess” the encoding it
is in. As part of that it will look not only for a BOM (Byte Order Mark), but also
for an <?xml?> declaration containing an encoding attribute.

Here’s the problem: if the encoding attribute in the xml declaration is wrapped in
single quotes, the parsing fails. In other words: BizTalk has a real issue with a
document that begins like this:

<?xml version='1.0' encoding='utf-8'?>

It doesn’t have a problem with this, however:

<?xml version='1.0' encoding="utf-8"?>

In fact, as long as the encoding attribute is missing or is wrapped with double
quotes, BizTalk will happily accept and correctly parse the message. If not, BizTalk
will fail with the following error:

There was a failure executing the receive
pipeline: "Microsoft.BizTalk.DefaultPipelines.XMLReceive, Microsoft.BizTalk.DefaultPipelines,
Version=3.0.1.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35" Source: "XML disassembler"
Receive Port: "ReceivePort1" URI: "C:\temp\BizTalk\TestIn\*.xml" Reason: Length cannot
be less than zero.
Parameter name: length

Looking through the XML Disassembler code using Reflector, it becomes very
clear that the Probe() method of the XmlDasmComp class (i.e. the disassembler component)
is fatally flawed, as it attempts to parse the <?xml?> declaration by hand
and explicitly expects the encoding attribute value to be wrapped in double
quotes. Here’s the relevant bit of code:

if (text2.Contains("encoding"))
{
    text1 = text2.Substring(text2.IndexOf("encoding"));
    text1 = text1.Substring(text1.IndexOf('"') + 1);
    text1 = text1.Substring(0, text1.IndexOf('"'));
}
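
To see why this produces the “Length cannot be less than zero” error, here is a rough Python model of the same three steps. It is only a sketch: dotnet_substring and probe_encoding are hypothetical helpers I made up to imitate the way .NET’s Substring rejects a negative length, and they are not part of BizTalk.

```python
def dotnet_substring(text, start, length=None):
    """Imitate .NET String.Substring, which rejects negative arguments."""
    if length is None:
        length = len(text) - start
    if start < 0:
        raise ValueError("StartIndex cannot be less than zero.")
    if length < 0:
        raise ValueError("Length cannot be less than zero.")
    if start + length > len(text):
        raise ValueError("Index and length must refer to a location within the string.")
    return text[start:start + length]


def probe_encoding(declaration):
    """Follow the same steps as the decompiled snippet.

    str.find returns -1 when nothing is found, just like IndexOf.
    """
    if "encoding" not in declaration:
        return None
    text1 = dotnet_substring(declaration, declaration.find("encoding"))
    # With single quotes there is no '"', so find() gives -1 and this keeps everything.
    text1 = dotnet_substring(text1, text1.find('"') + 1)
    # The second find() also gives -1, which becomes a negative length.
    text1 = dotnet_substring(text1, 0, text1.find('"'))
    return text1
```

Calling probe_encoding with a declaration whose encoding is double-quoted returns the encoding name, while the all-single-quotes declaration ends up asking for a substring of length -1 and raises the error.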

Unfortunately, there’s no easy workaround short of modifying the incoming
message, either by hand or through a custom decoding pipeline component that fixes
it.
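
For what it’s worth, the rewrite such a component would have to perform is small. Here is a minimal sketch of just that logic in Python, not an actual BizTalk pipeline component. The function name normalize_declaration is hypothetical, and the sketch assumes the message arrives as bytes in an ASCII-compatible encoding such as UTF-8, optionally preceded by a UTF-8 BOM.

```python
import re

# Matches an XML declaration at the very start of the message (after an
# optional UTF-8 BOM) whose encoding value is wrapped in single quotes.
_SINGLE_QUOTED_ENCODING = re.compile(
    rb"^((?:\xef\xbb\xbf)?<\?xml\b[^?]*?\bencoding\s*=\s*)'([^'\"]*)'"
)


def normalize_declaration(message: bytes) -> bytes:
    """Rewrite encoding='...' as encoding="..." in the XML declaration only."""
    return _SINGLE_QUOTED_ENCODING.sub(rb'\1"\2"', message, count=1)
```

Only the declaration is touched; single-quoted attributes in the body of the document are left alone, and a message with no declaration or with a double-quoted encoding passes through unchanged.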

Another Factory – WebClient Software Factory

The P&P team are at it again and have released the first drop of the next factory – The Web Client Software Factory. There’s a note on Ed Jezierski’s blog. The factory aims to provide guidance on web development using .NET 2.0 (ASPX, WF, Ajax etc) and will include guidance around security. No explicit mention is made of whether they will include material from the ASP.NET 2.0 Internet Security Reference Implementation, but I assume it will pull together all of the relevant guidance under one roof. Now how about one for BizTalk? 🙂

Code Generation

I am a huge fan of code generation.  I’ve used it to incredible advantage.
I’m talking about real dollar-value impact, the kind that can make or break a project.
If you’re reading this and haven’t yet drunk the Kool-Aid, then now is the time.
Read on.

Code Generation is the process of letting code write code, usually at
design-time.  There are tons of examples on the net to demonstrate this and I
won’t repeat them here (follow some links below and you’ll get to some fine examples).
Instead I’m going to discuss the concept purely theoretically.

When to “CodeGen” : The criteria for this are quite simple. 
If you have something which already defines the structure you desire and you are trying
to reflect that same structure in something else, then you should use code generation. 
This is why data access layers are the classic example, because you are mimicking
the structure of the database in your data access layer.

Why to “CodeGen” : Time and Agility!  The time it takes to set up
a code-generation system is a fraction of the time it takes to write the
same code for all source entities.  Your template takes slightly longer than
the time it takes for one class, but can then be instantly applied to all your
source entities.  If you have a database with 50 tables, write one class, turn
that one class into a template and bingo … 50 classes.

Agility is the other reason.  Let’s say you code all 50 classes by hand, and
then the next day the DBA refactors your database and adds a new table which has foreign
keys to lots of other tables.  Your 50 classes, pretty much toast.  If you
have generated those 50 classes using code generation, then this is no problem, simply
re-run your same template (no changes there, it’s agnostic to what database it’s looking
at) and bingo, 51 classes all with the right relationships.
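
To make the idea concrete, here is a tiny sketch in Python of the “one template, many classes” approach. Everything in it is an assumption made for illustration: the schema dictionary stands in for whatever metadata you would actually read from your database, and the generated class is deliberately trivial. A real tool would pull the structure from the database itself and generate far more.

```python
from string import Template

# One template, written once, describing what every generated class looks like.
CLASS_TEMPLATE = Template('''class $class_name:
    """Row object for the $table table."""

    def __init__(self, $args):
$assignments
''')


def generate_class(table, columns):
    """Render the template for a single table."""
    class_name = "".join(part.capitalize() for part in table.split("_"))
    assignments = "\n".join(f"        self.{col} = {col}" for col in columns)
    return CLASS_TEMPLATE.substitute(
        class_name=class_name,
        table=table,
        args=", ".join(columns),
        assignments=assignments,
    )


def generate_all(schema):
    """Render one class per table; the template knows nothing about the schema."""
    return "\n\n".join(generate_class(t, cols) for t, cols in schema.items())


# Hypothetical schema standing in for real database metadata.
schema = {
    "customer": ["id", "name"],
    "order_line": ["id", "customer_id", "quantity"],
}
source = generate_all(schema)
```

When the DBA adds a table, you add nothing to the template: the new table shows up in the schema, you call generate_all again, and the extra class appears alongside the rest.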

What to “CodeGen” : Data Access Layers are popular, but so are lots
of other things.  Databases make a good structured source of entities you could
generate against, but so do XSD Schemas, XML Files, and lots of other stuff.

Where to “CodeGen” : There are lots of great tools to use for
code generation out there, so many I could likely exhaust my fingers typing them all,
but there are just a few which come with my “Stamp of Approval”:

  • CodeSmith
    From CodeSmith Tools this is, in my opinion, the greatest code generator going. 
    I use it for almost all my code generation.  The syntax is very much like ASP.NET
    and is easy to manage.  The product is also easy to integrate into a nightly
    build process.
    • .NET Tiers – This is a set of templates created
      by the community for use with CodeSmith.  These templates will generate stored
      procedures, data access layers, business objects and web controls all from your database
      at the push of a button.  Oh, and it all has Unit Tests.  Check this out.
  • XSDObjectGen –
    One of the few things CodeSmith does not do well, yet, is generate from XSDs. 
    XSDObjectGen creates Serializable classes which represent your XSD’s structure completely. 
    XSD.exe, which comes with Visual Studio, does similar things, but XSDObjectGen does
    it 10x better.

Office Developer Training in October

Jonsie was out of the gates promoting some Office Developer Training that we have coming up in Auckland, Christchurch and Wellington this month.


We organised the training primarily for ISVs that are building products supporting the 2007 Office System.


There are a few extra seats available for developers interested in learning about Office. Check in here to learn more… and be quick if you’d like to secure a place.