I got an email from Chirag in Nigeria with the following problem. He had a positional flat file that he needed to parse, which looked like this (the ` marks the carriage return and line feed, and EOF is the end-of-file marker; remove these to test the file):
0120200300 01 THORNTON `
0120200400 01 OGIHARA `
0120200500 01 WILSON `
0080300101 JENNIFERMRS `
0080300201 BENJAMINMSTR EOF
There are two different schemas, or ‘sections’, of data here, identified by the first 3 digits at the start of each line: 012 and 008. Chirag had written a parser in C#, but he was concerned that it was too slow and that this was not the “BizTalk” way to do things.
1) Create a schema that looks like the one shown below:
2) Set the properties on the following nodes. The key properties that make this solution work are Parser Optimisation and Lookahead Depth. For performance, the Lookahead Depth must be set so that the parser can identify the first 3 characters of each record, i.e. 012 or 008. This works together with the Tag Identifier property, which matches different records to different areas of the schema: the 012 records are parsed into the K20 parent node area and the 008 records into the K30 parent node area. This is quite a powerful piece of functionality. The most common use I have seen for the Tag Identifier property is parsing header and footer records correctly into a schema.
Note: you will need BizTalk 2004 SP1 installed to set the Parser Optimisation, Lookahead Depth and Allow Early Termination properties in the schema properties dialog; otherwise you will need to open the schema and set them using Notepad or some other editor.
Node            Property Name             Property Value
Schema          Schema Editor Extension   Flat File Extension
                Parser Optimisation       Speed
                Lookahead Depth           3
                Allow Early Termination   No
AirlineMessage  Structure                 Delimited
                Child Delimiter Type      Hexadecimal
                Child Delimiter           0x0D 0x0A
                Child Order               Infix
K20             Structure                 Positional
                Tag Identifier            012
                Tag Offset                0
                Max Occurs                *
                Min Occurs                0
PLen            Positional Length         4
Key2            Positional Length         2
Key3            Positional Length         2
ChangedFlag     Positional Length         1
PartyNo         Positional Length         2
Remark          Positional Length         32
SplName         Positional Length         6
SurName         Positional Length         67
K30             Structure                 Positional
                Tag Identifier            008
                Tag Offset                0
                Max Occurs                *
                Min Occurs                0
PLen            Positional Length         4
Key2            Positional Length         2
Key3            Positional Length         2
ChangedFlag     Positional Length         1
Name            Positional Length         67
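To make the tag-based routing in step 2 more concrete, here is a rough Python sketch of the idea. It is not how the BizTalk flat file parser is implemented; it only mimics the visible behaviour: split the file on CR LF, look at the first 3 characters of each record (the Lookahead Depth), and send the record to the node whose Tag Identifier matches. The names `TAG_TO_NODE`, `LOOKAHEAD_DEPTH` and `route_records` are made up for this illustration, and the sample records are copied from the post as displayed, so their internal spacing may not match the real file.

```python
from collections import defaultdict

# Hypothetical lookup built from the Tag Identifier values in the table above.
TAG_TO_NODE = {"012": "K20", "008": "K30"}

# Number of leading characters inspected per record (the Lookahead Depth).
LOOKAHEAD_DEPTH = 3


def route_records(flat_file_text):
    """Group records by the node that their leading tag identifies."""
    grouped = defaultdict(list)
    # Records are separated by CR LF (0x0D 0x0A), used as an infix delimiter.
    for record in flat_file_text.split("\r\n"):
        if not record:
            continue  # tolerate an empty piece, e.g. after a trailing delimiter
        tag = record[:LOOKAHEAD_DEPTH]
        node = TAG_TO_NODE.get(tag)
        if node is None:
            raise ValueError(f"Unrecognised tag {tag!r} in record {record!r}")
        grouped[node].append(record)
    return dict(grouped)


# Sample records as they appear in the post (spacing may have been collapsed).
sample = "\r\n".join([
    "0120200300 01 THORNTON",
    "0120200400 01 OGIHARA",
    "0120200500 01 WILSON",
    "0080300101 JENNIFERMRS",
    "0080300201 BENJAMINMSTR",
])

result = route_records(sample)
print({node: len(records) for node, records in result.items()})
# prints {'K20': 3, 'K30': 2}
```

The sketch stops at routing on purpose: slicing each record into its positional fields is what the Positional Length values in the schema do for you.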
3) Now that you’ve got the schema, let’s test it. There are two easy ways to do this:
a) Right-click the schema in Solution Explorer, choose Properties, and set the entries in the dialog to the following:
Now right-click the schema in Solution Explorer again and select Validate Instance. If all goes well, the Output window should display something like the following:
Invoking component…
Validation generated XML output <file:///C:\DOCUME~1\ADMINI~1\LOCALS~1\Temp\_SchemaData\TestElement_output.xml>.
Validate Instance succeeded for schema TestElement.xsd, file: <file:///C:\Project\Artifacts\Biztalk Projects\UnuniformPositionalFlatFile\Test.pos>.
Component invocation succeeded.
Double-click the Validation generated XML output link and the XML output for the file will be displayed. Then click in the pane in which the XML is displayed, hold down the Ctrl key and press K and then D; this formats the resulting XML so that you can read it.
b) Call the Flat File Disassembler utility, FFDASM.exe. The easiest way to do this is to create a .bat file with contents like the following:
c:
cd C:\Program Files\Microsoft BizTalk Server 2004\SDK\Utilities\PipelineTools
ffdasm.exe "C:\path\inputFlatFileName.pos" -bs "C:\path\schemaName.xsd" -c -v
pause
The output should look like: