<Any> element doesn’t catch character entities


Viewing 1 reply thread
  • Author
    Posts
    • #13632

      What about wrapping values with CDATA clause?

      Neal Walters
      http://Biztalk-Training.com – Learn Biztalk Faster
      http://Shareoint-Training.com – End User Courses

    • #13633

      As far as I know there is no schema definition for CDATA. It’s just a statement in XML that temporarily turns off the well-formed-ness parser.
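To illustrate the CDATA idea, here is a minimal sketch in Python (the language and the helper name `wrap_in_cdata` are assumptions for illustration only; nothing here is BizTalk-specific). Inside a CDATA section the parser treats `&ndash;` as plain text, so the document parses, but the reference is not expanded into a dash.

```python
import xml.etree.ElementTree as ET

def wrap_in_cdata(tag, text):
    """Wrap text in a CDATA section inside <tag> (hypothetical helper)."""
    # "]]>" may not appear inside CDATA, so split it across two sections.
    safe = text.replace("]]>", "]]]]><![CDATA[>")
    return "<%s><![CDATA[%s]]></%s>" % (tag, safe, tag)

doc = wrap_in_cdata("MixedDescription", "some content &ndash; more")
elem = ET.fromstring(doc)

# The parser accepts the document; the entity reference stays literal text.
print(elem.text)
```

Note that whoever produces the XML has to emit the CDATA markers; a consumer can only add them by editing the raw text before parsing.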

      Neal

      • #13634

        The program that creates the XML would have to include the CData tag in the XML itself. Where does your XML come from?

        Neal

        • #13635

          Okay, let’s go back to the original issue. According to Stylus Studio, the XML is not well-formed. Even Internet Explorer says:

          Reference to undefined entity ‘ndash’. Error processing resource ‘file:///C:/TestNdash.xml’. Line 1, Position 37

          So rule #1 of XML is that XML has to be well-formed. How can another company send you XML that is not well-formed?

          In my opinion, they must provide well-formed XML.
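The same well-formedness check can be reproduced with any conforming XML parser. A small sketch using Python's standard library (Python and the helper name `is_well_formed` are assumptions, chosen only for illustration):

```python
import xml.etree.ElementTree as ET

def is_well_formed(xml_text):
    """Return True if the parser accepts xml_text, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        # Raised for any well-formedness error, including undefined entities.
        return False

bad = "<MixedDescription>some content &ndash; more</MixedDescription>"
good = "<MixedDescription>some content &amp; more</MixedDescription>"

print(is_well_formed(bad), is_well_formed(good))
```

`&amp;` passes because it is one of the five entities XML predefines; `&ndash;` is rejected because nothing declares it.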

          If they refuse, then you must somehow translate it to well-formed XML, probably through writing your own ReceivePipeline.

          Neal Walters
          http://Biztalk-Training.com – Learn Biztalk Faster
          http://Sharepoint-Training.com – End-User Videos

          • #13636

            I started thinking about this some more. Apparently XML by itself doesn’t define things like &ndash; or even &nbsp;; only &amp;, &lt;, &gt;, &quot; and &apos; are predefined. (I just now noticed the URLs in your original message; sorry I didn’t look at them earlier.)

            Since &nbsp; is more common I googled it and found this:

            http://lists.xml.org/archives/xml-dev/200009/msg00048.html
            &nbsp; is not a default XML entity. Either use a DTD to define it or use the
            unicode hex value directly in the entity (&#xA0; IIRC).
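As a quick check of the numeric-reference route, the sketch below (Python, assumed here purely for illustration) parses a document that uses `&#8211;` for the en dash and `&#xA0;` for the non-breaking space. No DTD is needed because numeric character references are resolved by the parser itself.

```python
import xml.etree.ElementTree as ET

doc = "<MixedDescription>some content &#8211; more&#xA0;text</MixedDescription>"
elem = ET.fromstring(doc)

# Both references are turned into the actual Unicode characters.
print(repr(elem.text))
```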

            http://www.xml.com/pub/a/2003/01/02/xmlchar.html
            http://xmlchar.sourceforge.net/

            http://www.dwheeler.com/essays/quotes-in-html.html

            But I’ll stick with my earlier statement that your trading partner must send you well-formed XML.

            Neal

            • #13637

              I have an <Any> element in my schema to catch any type of content:

              [code]<xs:element minOccurs="0" maxOccurs="unbounded" name="MixedDescription">
                <xs:complexType mixed="true">
                  <xs:sequence>
                    <xs:any processContents="skip" />
                  </xs:sequence>
                </xs:complexType>
              </xs:element>[/code]

              But it does not catch entities such as &ndash;:

              [code]<MixedDescription>….some content &ndash; … content continues</MixedDescription>[/code]

              XmlSpy, Stylus Studio and Visual Studio all point out that this is an undeclared character entity.

              I came across this idea to implement entities as elements:
              http://www.topxml.com/code/default.asp?p=3&id=v20010829094626

              but there are a LOT of possible XML character entities to replace:
              http://www.oasis-open.org/docbook/specs/wd-docbook-xmlcharent-0.3.html

              (I would have to do this in preprocessing)

              Is there a simpler way than replacing so many entities? (As I work for a publishing house, any of these could turn up.)

              Why doesn’t the <Any> element catch character entities?
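For the preprocessing route, a table-driven replacement may be less work than it sounds. The XML parser rejects an undeclared entity reference before schema validation ever runs, so the fix has to happen on the raw text. The sketch below is in Python rather than a BizTalk pipeline component (an assumption made for illustration; `named_to_numeric` is a hypothetical helper). It uses the standard library's HTML 4 entity table, so names from other entity sets would need a larger table.

```python
import re
from html.entities import name2codepoint

# The five entities every XML parser already knows; leave them untouched.
XML_PREDEFINED = {"amp", "lt", "gt", "quot", "apos"}
ENTITY_REF = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")

def named_to_numeric(xml_text):
    """Replace HTML 4 named entities with numeric character references."""
    def replace(match):
        name = match.group(1)
        if name in XML_PREDEFINED or name not in name2codepoint:
            return match.group(0)  # predefined or unknown: keep as-is
        return "&#%d;" % name2codepoint[name]
    return ENTITY_REF.sub(replace, xml_text)

raw = "<MixedDescription>some content &ndash; more &amp; more</MixedDescription>"
fixed = named_to_numeric(raw)
print(fixed)
```

One limitation of this sketch: it works on raw text, so it would also rewrite entity-like strings inside CDATA sections or comments.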

              • #13638

                Good idea, and it works. But how do I declare that in a BizTalk schema?

                [quote="nwalters"]What about wrapping values with CDATA clause?

                Neal Walters
                http://Biztalk-Training.com – Learn Biztalk Faster
                http://Shareoint-Training.com – End User Courses[/quote]

                • #13639

                  Right, but how do I tell BizTalk that a node is a CDATA node and that it should ignore its contents?

                  [quote="nwalters"]As far as I know there is no schema definition for CDATA. It’s just a statement in XML that temporarily turns off the well-formed-ness parser.

                  Neal[/quote]

                  • #13640

                    OK, I figured this out via this hint:
                    http://searchwebservices.techtarget.com/tip/1,289483,sid26_gci879720,00.html

                    Use an XSLT library called xmlchar:
                    http://www.xml.com/pub/a/2003/01/02/xmlchar.html

                    and add these lines to the instance document via pre-processing, where xmlchar is the path (href) to the library:

                    [code]<!DOCTYPE ONIXMessage [<!ENTITY % html.4.entities SYSTEM "xmlchar/html4-all.ent"> %html.4.entities;]>
                    <ONIXMessage xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:ch="http://xmlchar.sf.net/ns#" …(other declarations) >[/code]

                    Don’t forget to add the xmlns:ch declaration, and make sure the name in the DOCTYPE matches your root element.
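The DOCTYPE injection step can be sketched as follows. This is an illustration in Python, not the actual pre-processing used above: the helper name `with_entity_doctype` is hypothetical, and it declares the entities inline in the internal subset instead of referencing the external `html4-all.ent` file, because as far as I know Python's standard parser does not fetch external entity files. It also assumes the input has no DOCTYPE of its own.

```python
import re
import xml.etree.ElementTree as ET

def with_entity_doctype(xml_text, root_name, entities):
    """Insert a DOCTYPE declaring the given entities (name -> code point)."""
    decls = "".join(
        '<!ENTITY %s "&#%d;">' % (name, cp)
        for name, cp in sorted(entities.items())
    )
    doctype = "<!DOCTYPE %s [%s]>" % (root_name, decls)
    # The DOCTYPE must come after the XML declaration, if there is one.
    match = re.match(r"\s*<\?xml[^>]*\?>", xml_text)
    pos = match.end() if match else 0
    return xml_text[:pos] + doctype + xml_text[pos:]

raw = "<ONIXMessage>some content &ndash; more</ONIXMessage>"
patched = with_entity_doctype(raw, "ONIXMessage", {"ndash": 8211})
elem = ET.fromstring(patched)
print(repr(elem.text))
```

The root name passed in has to match the document's root element, which is the same constraint as with the external entity file.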

                    Pras

                    • #13641

                      From suppliers I don’t have control over. That was the whole point.

                      [quote="nwalters"]The program that creates the XML would have to include the CData tag in the XML itself. Where does your XML come from?

                      Neal[/quote]

  • The forum ‘BizTalk 2004 – BizTalk 2010’ is closed to new topics and replies.