I got an email from Chirag in Nigeria with the following problem. He had a positional flat file that he needed to parse, which looked like this (the ` marks the carriage return and line feed, and EOF marks the end of the file; remove these markers to test the file):


0120200300 01                                      THORNTON                                                           `
0120200400 01                                      OGIHARA                                                            `
0120200500 01                                      WILSON                                                             `
0080300101 JENNIFERMRS                                                        `
0080300201 BENJAMINMSTR                                                       EOF
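If you copy the listing above, the ` and EOF markers have to be turned back into real bytes before the file can be tested. The sketch below is one possible way to do that; the function name `listing_to_bytes` is made up for illustration, and the two sample records use shortened padding rather than the full record width.

```python
def listing_to_bytes(listing: str) -> bytes:
    """Turn the marked-up listing into raw flat-file bytes.

    A trailing ` stands for CR LF and a trailing EOF stands for the end of
    the file, so both markers are dropped and the records are joined with
    CR LF. No CR LF is written after the last record.
    """
    records = []
    for line in listing.splitlines():
        if line.endswith("`"):
            records.append(line[:-1])   # drop the CR LF marker
        elif line.endswith("EOF"):
            records.append(line[:-3])   # drop the end-of-file marker
        elif line.strip():
            records.append(line)        # unmarked, non-blank line: keep as-is
    return "\r\n".join(records).encode("ascii")


# Two records with shortened padding, for illustration only.
sample = "0080300101 JENNIFERMRS   `\n0080300201 BENJAMINMSTR  EOF"
data = listing_to_bytes(sample)
print(data)
```

Trailing spaces are kept on purpose, because in a positional file they are part of the record.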


There are two different schemas, or ‘Sections’, of data here, and they are identified by the first 3 digits of each line: 012 and 008. Chirag had written a parser in C#, but he was concerned that it was too slow and that this was not the “BizTalk” way to do things.


1) Create a schema that looks like the one shown below:


 



2) Set the properties on the following nodes. The key properties that make this solution work are Parser Optimisation and Lookahead Depth, which are there for performance. The Lookahead Depth must be set so that the parser can identify the first 3 characters of each record, i.e. 012 or 008. This works in turn with the Tag Identifier property, which matches different records with different areas in the schema: the 012 records should be parsed into the K20 parent node area and the 008 records into the K30 parent node area. This is quite a powerful piece of functionality. The most common use I have seen for the Tag Identifier property is correctly parsing header and footer records into a schema.
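To make the idea of tag-based matching concrete, here is a small Python sketch of the concept as described above. It is not how the BizTalk parser is implemented; the names `LOOKAHEAD_DEPTH`, `TAG_TO_NODE`, `classify` and `group_records` are invented for this illustration.

```python
from typing import List, Tuple

# Assumption: mirrors the description in this post, where the first
# 3 characters of a record decide which schema node it belongs to.
LOOKAHEAD_DEPTH = 3
TAG_TO_NODE = {"012": "K20", "008": "K30"}


def classify(record: str) -> str:
    """Return the node name for a record, based on its leading tag."""
    tag = record[:LOOKAHEAD_DEPTH]
    if tag not in TAG_TO_NODE:
        raise ValueError(f"no record definition for tag {tag!r}")
    return TAG_TO_NODE[tag]


def group_records(message: str) -> List[Tuple[str, str]]:
    """Split a CR LF delimited message and pair each record with its node."""
    return [(classify(rec), rec) for rec in message.split("\r\n") if rec]


# Records shortened for readability; real records are padded to full width.
message = "0120200300 01\r\n0080300101 JENNIFERMRS"
for node, record in group_records(message):
    print(node, repr(record))
```

A record whose tag is not in the table raises an error here, which roughly corresponds to the flat file failing validation against the schema.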


Note: you will need BizTalk 2004 SP1 installed to set the Parser Optimisation, Lookahead Depth and Allow Early Termination properties in the schema properties dialog. Otherwise you will need to open the schema file and set them using Notepad or some other editor.


Node                   Property Name                        Property Value
Schema                 Schema Editor Extension              Flat File Extension 
                       Parser Optimisation                  Speed
                       Lookahead Depth                      3
                       Allow Early Termination              No


 


AirlineMessage         Structure                            Delimited
                       Child Delimiter Type                 Hexadecimal
                       Child Delimiter                      0x0D 0x0A
                       Child Order                          Infix


 


K20                    Structure                            Positional
                       Tag Identifier                       012
                       Tag Offset                           0
                       Max Occurs                           *
                       Min Occurs                           0


PLen                   Positional Length                    4


Key2                   Positional Length                    2


Key3                   Positional Length                    2


ChangedFlag            Positional Length                    1


PartyNo                Positional Length                    2


Remark                 Positional Length                    32


SplName                Positional Length                    6


SurName                Positional Length                    67


 


K30                    Structure                            Positional
                       Tag Identifier                       008
                       Tag Offset                           0
                       Max Occurs                           *
                       Min Occurs                           0


PLen                   Positional Length                    4


Key2                   Positional Length                    2


Key3                   Positional Length                    2


ChangedFlag            Positional Length                    1


Name                   Positional Length                    67
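The positional lengths in the table can be read as plain slicing offsets. The following Python sketch applies the K20 lengths to one record; it only illustrates the arithmetic and is not BizTalk code. The names `K20_FIELDS` and `parse_positional` are hypothetical, the record is rebuilt from the OGIHARA sample line (38 spaces between "01" and the surname, padded to the 116 characters the K20 lengths add up to), and stripping the padding from each value is an assumption.

```python
from typing import Dict, List, Tuple

# Field names and lengths copied from the K20 rows of the table above.
K20_FIELDS: List[Tuple[str, int]] = [
    ("PLen", 4), ("Key2", 2), ("Key3", 2), ("ChangedFlag", 1),
    ("PartyNo", 2), ("Remark", 32), ("SplName", 6), ("SurName", 67),
]


def parse_positional(record: str, fields: List[Tuple[str, int]]) -> Dict[str, str]:
    """Cut a fixed-width record into named fields, left to right."""
    values = {}
    offset = 0
    for name, width in fields:
        # Assumption: surrounding pad characters are not wanted in the value.
        values[name] = record[offset:offset + width].strip()
        offset += width
    return values


# Rebuilt sample record, padded to the total K20 width (116 characters).
total_width = sum(width for _, width in K20_FIELDS)
record = ("0120200400 01" + " " * 38 + "OGIHARA").ljust(total_width)

parsed = parse_positional(record, K20_FIELDS)
print(parsed)
```

The K30 record works the same way with its own, shorter list of lengths.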


 


3) Now that you’ve got the schema, let’s test it. There are two easy ways to test the schema:
a) Right-click the schema in Solution Explorer, choose Properties, and set the entries in the dialog to the following:


Now right-click the schema in Solution Explorer again and select Validate Instance. If all goes OK, the output window should display something like the following:


Invoking component…
Validation generated XML output <file:///C:\DOCUME~1\ADMINI~1\LOCALS~1\Temp\_SchemaData\TestElement_output.xml>.
Validate Instance succeeded for schema TestElement.xsd, file: <file:///C:\Project\Artifacts\Biztalk Projects\UnuniformPositionalFlatFile\Test.pos>.
Component invocation succeeded.
 
 
Double-click the “Validation generated XML output” link and the XML output for the file will be displayed. Then click in the pane in which the XML is displayed, hold down the Ctrl key and press K and then D. This formats the result XML so that you can read it.


b) Call the Flat File Disassembly utility, FFDASM.exe. The easiest way to do this is to create a .bat file with contents like the following:


c:
cd C:\Program Files\Microsoft BizTalk Server 2004\SDK\Utilities\PipelineTools
ffdasm.exe "C:\path\inputFlatFileName.pos" -bs "C:\path\schemaName.xsd" -c -v
pause


The output should look like:



 <AirlineMessage xmlns="http://UnuniformPositionalFlatFile.Test">
   <K20 PLen="0120" Key2="20" Key3="02" ChangedFlag="0" PartyNo="0" Remark="01" SplName="" SurName="THORNTON" xmlns="" />
   <K20 PLen="0120" Key2="20" Key3="04" ChangedFlag="0" PartyNo="0" Remark="01" SplName="" SurName="OGIHARA" xmlns="" />
   <K20 PLen="0120" Key2="20" Key3="05" ChangedFlag="0" PartyNo="0" Remark="01" SplName="" SurName="WILSON" xmlns="" />
   <K30 PLen="0080" Key2="30" Key3="01" ChangedFlag="0" Name="JENNIFERMRS" xmlns="" />
   <K30 PLen="0080" Key2="30" Key3="02" ChangedFlag="0" Name="BENJAMINMSTR" xmlns="" />
 </AirlineMessage>
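If you want a quick check that every record landed in the right section, you can load the output and count the child nodes. The sketch below does this in Python with the standard library. It is only an illustration: it uses a well-formed copy of the output shown above with the attribute values as printed, and the variable names are my own.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Well-formed copy of the output above (assumption: values as printed there).
OUTPUT_XML = """<AirlineMessage xmlns="http://UnuniformPositionalFlatFile.Test">
  <K20 PLen="0120" Key2="20" Key3="02" ChangedFlag="0" PartyNo="0" Remark="01" SplName="" SurName="THORNTON" xmlns="" />
  <K20 PLen="0120" Key2="20" Key3="04" ChangedFlag="0" PartyNo="0" Remark="01" SplName="" SurName="OGIHARA" xmlns="" />
  <K20 PLen="0120" Key2="20" Key3="05" ChangedFlag="0" PartyNo="0" Remark="01" SplName="" SurName="WILSON" xmlns="" />
  <K30 PLen="0080" Key2="30" Key3="01" ChangedFlag="0" Name="JENNIFERMRS" xmlns="" />
  <K30 PLen="0080" Key2="30" Key3="02" ChangedFlag="0" Name="BENJAMINMSTR" xmlns="" />
</AirlineMessage>"""

root = ET.fromstring(OUTPUT_XML)

# The child records carry xmlns="", so their tags have no namespace prefix.
counts = Counter(child.tag for child in root)
surnames = [k20.get("SurName") for k20 in root.findall("K20")]

print(counts["K20"], counts["K30"])  # prints: 3 2
print(surnames)
```

With the five sample records this should report three K20 nodes and two K30 nodes, matching the three 012 lines and two 008 lines in the input file.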

 

 R. Addis