The XML Specification allows attribute values to be wrapped in either double quotes
(") or single quotes ('). Normally BizTalk Server, as a good standards-abiding citizen, has
no problem with this. However, I had been seeing messages on the BizTalk newsgroup
claiming that BizTalk refuses to process XML documents that use single quotes when
the XML Disassembler component tries to parse the incoming message.

I spent a few minutes researching this claim and found that there is indeed some
truth to it. There appears to be a bug in the XML Disassembler when it runs into
documents using single quotes, but it only affects some documents.

The first thing the XML Disassembler component does when it receives a message is
probe it to see if it is indeed an XML message and try to “guess” the encoding it
is in. As part of that, it looks not only for a BOM (Byte Order Mark), but also
for an <?xml?> declaration containing an encoding attribute.
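
To make the BOM half of that probe concrete, here is a minimal sketch of what sniffing a byte order mark can look like. This is not BizTalk's code; the BomSniffer class and DetectBom method are names I made up for illustration, and the sketch only covers the UTF-8 and UTF-16 marks (it ignores UTF-32, whose little-endian mark also starts with FF FE).

```csharp
using System;
using System.Text;

static class BomSniffer
{
    // Hypothetical helper: returns the encoding implied by a leading
    // byte order mark, or null when the buffer has no recognized BOM.
    public static Encoding DetectBom(byte[] head)
    {
        if (head.Length >= 3 && head[0] == 0xEF && head[1] == 0xBB && head[2] == 0xBF)
            return Encoding.UTF8;              // EF BB BF
        if (head.Length >= 2 && head[0] == 0xFF && head[1] == 0xFE)
            return Encoding.Unicode;           // FF FE, UTF-16 little endian
        if (head.Length >= 2 && head[0] == 0xFE && head[1] == 0xFF)
            return Encoding.BigEndianUnicode;  // FE FF, UTF-16 big endian
        return null;                           // no BOM: fall back to the declaration
    }
}
```

When no BOM is present, a probe like this has to fall back on the encoding attribute of the declaration, which is where the trouble starts.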

Here’s the problem: if the encoding attribute in the XML declaration is wrapped in
single quotes, the parsing fails. In other words, BizTalk has a real issue with a
document that begins like this:

<?xml version='1.0' encoding='utf-8'?>

It doesn’t have a problem with this, however:

<?xml version='1.0' encoding="utf-8"?>

In fact, as long as the encoding attribute is missing or is wrapped with double
quotes, BizTalk will happily accept and correctly parse the message. If not, BizTalk
will fail with the following error:

There was a failure executing the receive
pipeline: "Microsoft.BizTalk.DefaultPipelines.XMLReceive, Microsoft.BizTalk.DefaultPipelines,
Version=3.0.1.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35" Source: "XML disassembler"
Receive Port: "ReceivePort1" URI: "C:\temp\BizTalk\TestIn\*.xml" Reason: Length cannot
be less than zero.
Parameter name: length

Looking through the XML Disassembler code with Reflector, it becomes very clear
that the Probe() method of the XmlDasmComp class (i.e. the disassembler component)
is fatally flawed: it attempts to parse the <?xml?> declaration by hand and
explicitly expects the encoding attribute value to be wrapped in double
quotes. Here’s the relevant bit of code:

if (text2.Contains("encoding"))
{
    text1 = text2.Substring(text2.IndexOf("encoding"));
    text1 = text1.Substring(text1.IndexOf('"') + 1);
    text1 = text1.Substring(0, text1.IndexOf('"'));
}
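
To see why this produces the "Length cannot be less than zero" error, the same string handling can be run outside BizTalk. The sketch below copies the three Substring calls into a standalone method; the ProbeRepro class, the ExtractEncoding name and the Main harness are mine, not part of the disassembler. With no double quote in the text, IndexOf('"') returns -1, so the last call becomes Substring(0, -1) and throws.

```csharp
using System;

static class ProbeRepro
{
    // Same logic as the decompiled snippet, wrapped in a hypothetical method.
    public static string ExtractEncoding(string text2)
    {
        string text1 = null;
        if (text2.Contains("encoding"))
        {
            text1 = text2.Substring(text2.IndexOf("encoding"));
            // IndexOf('"') is -1 when the value uses single quotes,
            // so this first call silently keeps the whole string...
            text1 = text1.Substring(text1.IndexOf('"') + 1);
            // ...and this one asks for a length of -1, which throws.
            text1 = text1.Substring(0, text1.IndexOf('"'));
        }
        return text1;
    }
}

static class Program
{
    static void Main()
    {
        // Double-quoted encoding value: parses fine.
        Console.WriteLine(ProbeRepro.ExtractEncoding("<?xml version='1.0' encoding=\"utf-8\"?>")); // prints "utf-8"

        try
        {
            // Single-quoted encoding value: same failure as in the pipeline.
            ProbeRepro.ExtractEncoding("<?xml version='1.0' encoding='utf-8'?>");
        }
        catch (ArgumentOutOfRangeException ex)
        {
            Console.WriteLine(ex.ParamName); // prints "length"
        }
    }
}
```

Note that the quotes around the version attribute never matter here, because the search for a double quote only starts at the word "encoding".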

Unfortunately, there’s no easy workaround short of modifying the incoming
message, either by hand or through a custom decoding pipeline component that
fixes it.
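
For what it's worth, the fix such a component would apply is small. The following is only a sketch of the text rewrite, assuming the start of the message has already been decoded into a string; it is not a complete pipeline component, and the XmlDeclFixer class and NormalizeEncodingQuotes method are hypothetical names. It swaps single quotes for double quotes around the encoding value in the declaration and leaves everything else alone.

```csharp
using System;
using System.Text.RegularExpressions;

static class XmlDeclFixer
{
    // Matches an XML declaration at the very start of the text (optionally
    // preceded by a BOM character) whose encoding value is single-quoted.
    static readonly Regex SingleQuotedEncoding = new Regex(
        @"^(\uFEFF?<\?xml\s[^>]*?\bencoding\s*=\s*)'([^']*)'",
        RegexOptions.CultureInvariant);

    // Hypothetical helper: rewrites encoding='x' to encoding="x" in the
    // declaration only; documents without a match are returned unchanged.
    public static string NormalizeEncodingQuotes(string xml)
    {
        return SingleQuotedEncoding.Replace(xml, "$1\"$2\"", 1);
    }
}
```

In a real decode-stage component this rewrite would have to be applied to the message body stream before the XML Disassembler sees it, and only to the first few bytes, so that large messages are not loaded into memory.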