by stephen-w-thomas | Feb 10, 2005 | Stephen's BizTalk and Integration Blog
I have talked a lot in the past about Convoys. But, under the covers what really makes Convoy message processing different? What makes a Convoy a Convoy?
To sum it all up into one sentence: Convoys require database level message-to-Orchestration correlation independent of any running Orchestration instances.
Is this confusing? Yes. But trust me, more detailed information is coming soon on this topic. For now, I want to briefly go over the basic internals of how Convoys work in BizTalk Server 2004.
When you deploy an Orchestration:
The presence of a Convoy is detected at the time a BizTalk 2004 Orchestration is enlisted. At this point, an entry is inserted into the ConvoySets table inside the Message Box that lists the properties used in the Convoy Set. This is limited to only three properties because each property is listed in a column inside the table. The subscriptions that can activate this Convoy are marked with the uidConvoySetID for that Convoy Set.
When a message arrives:
Every time a new message arrives in BizTalk, it is checked to see if it is part of a Convoy or if it should start a Convoy. If it should start a Convoy, an entry is inserted into the ConvoySetInstances table that routes messages to the correct instance. If it is part of an existing Convoy (i.e. it has the same Convoy properties as an existing Convoy Set), it is routed to the Service Instance listed inside the table. Otherwise, it just follows the normal subscription process.
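To make the routing decision above more concrete, here is a small Python sketch of the idea. It is a toy in-memory model, not BizTalk's actual implementation: the class names, the `route` method, and the `instance-N` identifiers are all assumptions made for illustration. Only the table names and the three-property limit come from the post.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple

MAX_CONVOY_PROPERTIES = 3  # the post notes a Convoy Set holds at most three properties


@dataclass
class ConvoySet:
    """Stand-in for one row of the ConvoySets table (created at enlist time)."""
    convoy_set_id: str
    properties: Tuple[str, ...]

    def __post_init__(self):
        if len(self.properties) > MAX_CONVOY_PROPERTIES:
            raise ValueError("a Convoy Set is limited to three properties")


@dataclass
class MessageBox:
    """Toy model: only the convoy part of routing is represented."""
    convoy_sets: Dict[str, ConvoySet] = field(default_factory=dict)
    # (convoy set id, property values) -> service instance id
    convoy_set_instances: Dict[Tuple[str, Tuple[str, ...]], str] = field(default_factory=dict)
    next_instance: int = 1

    def enlist(self, convoy_set: ConvoySet) -> None:
        self.convoy_sets[convoy_set.convoy_set_id] = convoy_set

    def route(self, message: Dict[str, str]) -> Optional[str]:
        """Return the service instance that should receive the message,
        or None when the message falls through to normal subscriptions."""
        for cs in self.convoy_sets.values():
            if not all(p in message for p in cs.properties):
                continue
            key = (cs.convoy_set_id, tuple(message[p] for p in cs.properties))
            if key not in self.convoy_set_instances:
                # First message with these values starts a new Convoy.
                self.convoy_set_instances[key] = f"instance-{self.next_instance}"
                self.next_instance += 1
            return self.convoy_set_instances[key]
        return None


box = MessageBox()
box.enlist(ConvoySet("cs1", ("CustomerId",)))
first = box.route({"CustomerId": "42"})
second = box.route({"CustomerId": "42"})
other = box.route({"CustomerId": "7"})
```

In this sketch `first` and `second` land on the same instance because they share the same property value, while `other` starts a new one. The point is that the lookup happens in the tables, independent of whatever the running Orchestration is doing.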
Convoy Virtual Subscription:
This database level message routing creates a type of virtual subscription on top of the standard subscriptions inside the message box. This is set up internally using a series of tables.
I have put together a simple SQL query to view all Orchestrations that belong to a Convoy and what the Convoy virtual subscription properties are for those instances. I have not tested this code under all Convoy scenarios, so please use caution when running the query, as it may yield unexpected results. Note that for this viewer to return any data you must have running Orchestration instances that are part of a Convoy.
DOWNLOAD: Sample Convoy Virtual Subscription Viewer
CRITICAL: This application and included SQL will query critical BizTalk SQL Tables. This should not be run on production servers. Review all code prior to execution and use extreme caution when viewing any BizTalk related database tables. Use this code at your own risk.
This sample is also an excellent example of why I am a BizTalk Developer rather than a .net Developer :).
Here are some of my samples that use Convoys that can be used with this viewer.
DOWNLOAD: Concurrent Receive Convoys (post date 11/5/2004)
DOWNLOAD: Sequential Receive Convoys (post date 8/23/2004)
by stephen-w-thomas | Feb 10, 2005 | Downloads
This tool can be used to see the Convoy properties associated with any running Orchestration processing messages as a Convoy. Note that this queries critical BizTalk tables. Run with care. This tool works with BizTalk 2004 and BizTalk 2006.
Get more information from the original blog post on this topic: http://www.biztalkgurus.com/blogs/biztalk/archive/2005/02/11/Convoy-Subscription-Viewer-for-BizTalk-Server-2004.aspx
by stephen-w-thomas | Feb 3, 2005 | Stephen's BizTalk and Integration Blog
This is a cool feature of BizTalk 2004! It allows you to map many documents into one or map one document into many. In either case, your inputs will not equal your outputs. This type of mapping does require some up front planning and careful consideration.
First off, this type of mapping is only allowed inside the Orchestration. Basically, what the mapper does is create a kind of multi-part message for either the input, output, or both. The parts of the multi-part message are the different input/output messages. This allows the mapper to take in or produce multiple messages.
I have only been able to create these types of maps by creating new maps using the Transform Shape inside the Orchestration. I have tried to manually create them, but I have not been able to create/replicate the multi message behavior. Also, I have not been able to find the mystery schema that the Orchestration generates when creating these new maps. Surely it has to be someplace…
How to create maps with multiple messages?
1. Create Orchestration Messages inside your Orchestration. Let's say Input1, Input2, Output1, and Output2.
2. Add a Transform Shape to your Orchestration.
3. Select Output1 and Output2 inside the Construct Shape as Constructed Messages.
4. Click on the Transform Shape. It will look like the figure below.
5. Add the needed input messages under Source and output messages under Destination. Each message needs to go on a new line inside the Transform Shape. In this case, Input1 and Input2 are added to Source and Output1 and Output2 are added to Destination.
6. Once completed, make sure the “Launch the BizTalk Mapper” option is checked and hit OK.
7. The mapper will open and you will see multiple messages on the Source and Destination. These are nothing more than a new schema that includes the schemas from the input and output messages.
Note: Be mindful of namespaces as I *think* they are required on all the multiple messages used by the new map.
The results would look something like this.
8. Map as needed.
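As a rough picture of what step 7 describes, the Python sketch below wraps two input documents under a single root so one transform can see both. The element names `Root` and `Part_N` are invented for this illustration and are not claimed to match the schema the Orchestration generates. The sample documents are made up as well.

```python
import xml.etree.ElementTree as ET


def wrap_messages(*documents: str) -> ET.Element:
    """Combine several XML documents under one root, one child per message part.

    The names 'Root' and 'Part_N' are made up for this sketch.
    """
    root = ET.Element("Root")
    for index, doc in enumerate(documents):
        part = ET.SubElement(root, f"Part_{index}")
        part.append(ET.fromstring(doc))
    return root


# Hypothetical stand-ins for Input1 and Input2.
input1 = "<Customer><Name>Contoso</Name></Customer>"
input2 = "<Order><Total>10</Total></Order>"
combined = wrap_messages(input1, input2)
print(ET.tostring(combined, encoding="unicode"))
```

A map written against the combined document can then link from either part, which is the behavior the multi-message mapper exposes on the Source side.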
You can modify an existing map to use multi messages but it will break any existing links. If you want to modify an existing map, it has to be in the same project in order to modify the map to allow multi messages. This is done by opening up the Transform Shape and adding additional messages to a new line under Source or Destination. Once modified, then it can be moved to another project.
Note: Moving the map might mess up all the namespaces and references, so manual editing of the XSLT may be required. So be careful! Moving the maps works best if all the schemas used inside the map are referenced from an external schemas project.
Overall: Mapping with multi messages is an extremely powerful feature when used inside the Orchestration. It can, however, be difficult to convert existing maps into maps that use multi messages.
by stephen-w-thomas | Jan 27, 2005 | Downloads
This map sample shows how to suppress nodes inside a map by passing false from a logical functoid. The blog post below gives more detail on the sample.
The mapper in BizTalk 2004 makes it easy to suppress nodes inside the mapper. The key is to send the looping node a “false” from a Logical Functoid. I have tried sending a Boolean False value from a Scripting Functoid but that was unsuccessful.
Get more information from the original blog post on this topic: http://www.biztalkgurus.com/biztalk_server/biztalk_blogs/b/biztalk/archive/2005/01/27/how-to-suppress-nodes-in-the-biztalk-mapper.aspx
by stephen-w-thomas | Jan 26, 2005 | Stephen's BizTalk and Integration Blog
Recently on a project I was required to filter out single records from a large batch that were not required to be processed inside the business process. In this case, the vast majority of the thousands of input records would not be required in the Orchestration.
Rather than load the whole document into the Orchestration or break the document up into single messages, I decided to just suppress the non-required nodes in the mapper. This would greatly reduce the number of input records into the system.
The mapper in BizTalk 2004 makes it easy to suppress nodes inside the mapper. The key is to send the looping node a “false” from a Logical Functoid. I have tried sending a Boolean False value from a Scripting Functoid but that was unsuccessful.
This would look like this:
In this case, if the Logical Equals is “false” the Info node will not be created. This can be very handy if you want to suppress records for items that are out of stock or for records that do not need to be processed inside the Orchestration.
Note that in this case the Looping Functoid is not required. Removing it does not change the results. The Node is still suppressed if the Logical Equals returns “false”. I always include the Looping Functoid when looping records.
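The effect of the suppression can be sketched outside the mapper. The Python below is only an analogy: it copies each source record to an `Info` node unless a boolean test comes back false. The `Items`, `Item`, `Sku`, and `InStock` names are assumptions for the example, and the real map does this through functoids rather than code.

```python
import xml.etree.ElementTree as ET

# Hypothetical source document for the sketch.
SOURCE = """
<Items>
  <Item><Sku>A1</Sku><InStock>true</InStock></Item>
  <Item><Sku>B2</Sku><InStock>false</InStock></Item>
  <Item><Sku>C3</Sku><InStock>true</InStock></Item>
</Items>
"""


def map_items(source_xml: str) -> ET.Element:
    """Copy each Item to an Info node unless the logical test is false."""
    source = ET.fromstring(source_xml)
    target = ET.Element("Output")
    for item in source.findall("Item"):
        # Plays the role of the Logical Equals functoid in the post.
        keep = item.findtext("InStock") == "true"
        if not keep:
            continue  # a "false" result means the Info node is never created
        info = ET.SubElement(target, "Info")
        ET.SubElement(info, "Sku").text = item.findtext("Sku")
    return target


result = map_items(SOURCE)
print([info.findtext("Sku") for info in result.findall("Info")])
```

Only the records that pass the test reach the output, so the Orchestration never sees the suppressed ones.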
I have put together a sample that shows a simple suppression and a more complex scenario.
Download: Suppress Nodes Sample
by stephen-w-thomas | Jan 26, 2005 | Stephen's BizTalk and Integration Blog
The long awaited Service Pack 1 for BizTalk 2004 is now available.
Get some more details on Scott’s Blog.
Go directly to the Microsoft Download Site.
by stephen-w-thomas | Jan 17, 2005 | Stephen's BizTalk and Integration Blog
The BizTalk 2004 Exam is now available!
The exam number is 074-135: Developing E-Business Solutions Using Microsoft BizTalk Server 2004 (English).
Get some more details on Scott’s Blog.
Best of luck!
PS: Sorry for the long delay in posting. More content and samples are coming soon!
by stephen-w-thomas | Dec 17, 2004 | Stephen's BizTalk and Integration Blog
I have seen a lot of posts on various newsgroups over the past few months about the Flat File Disassembler and how it produces output. I think it is rather confusing, so I put together a sample that I hope will shed some light on the subject.
Download: Flat File Disassembler Output Sample
Watch the video: Flat File Disassembler Output Options Video
The Flat File Disassembler is used to convert flat file documents (either positional, delimited, or a supported combination) into XML based on a defined schema. The schema must import the Flat File Schema Extensions and have all the required delimiters and positions set, of course. Flashback: This type of conversion was accomplished using envelopes in BizTalk 2002.
The Flat File Disassembler can take in a large batch file and produce either one large XML output or single record XML outputs. This is the confusing part… The control of this is based on how the schema is defined and how the properties are set on the Flat File Disassembler inside the Receive Pipeline.
Producing One XML Output
In order to produce one output file, simply define a single schema with the required Header, Body (make sure this is set to 1 to many), and Trailer records. Then, set the Document Schema property inside the Receive Pipeline Disassembler component to this schema. Do not set anything for the Header or Trailer Schema. This will produce one output based on the input file.
In my sample, this is illustrated in schema AllAsOne.xsd and AllAsOneNoHeader.xsd. The accompanying pipelines are recAllAsOne.btp and recAllAsOneNoHeader.btp.
Producing Single Message Output
In order to produce a single XML document per input record, the Header, Body, and Trailer records will need to be defined as separate schemas. Then, each of these will need to be set accordingly inside the Receive Pipeline Disassembler component. The base Body message should be set to the Document Schema property.
In my sample, this is illustrated in schema AllAsOne.xsd and AllAsOneNoHeader.xsd. The accompanying pipelines are recAllAsOne.btp and recAllAsOneNoHeader.btp.
Inside the sample, pay special attention to the Receive Pipelines. Note the differences in the settings and the schemas used to return a single record versus one file. The sample includes both a flat file with a Header and one with just Body records. To run the sample, see the ReadMe.txt. I have included 4 Orchestrations to allow for easy Receive and Send Port creation.
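The two output modes can be illustrated with a short Python sketch. This is not how the Flat File Disassembler is implemented. It only mimics the outcome, assuming a pipe-delimited file whose lines start with `H`, `B`, or `T`. The element names and the `disassemble` function are made up for the example.

```python
import xml.etree.ElementTree as ET
from typing import List

# Hypothetical flat file: header, two body records, trailer.
FLAT_FILE = "H|20041217\nB|1001|Widget\nB|1002|Gadget\nT|2\n"


def body_element(line: str) -> ET.Element:
    """Turn one delimited body line into a Body element."""
    _, record_id, name = line.split("|")
    body = ET.Element("Body")
    ET.SubElement(body, "Id").text = record_id
    ET.SubElement(body, "Name").text = name
    return body


def disassemble(flat: str, single_records: bool) -> List[str]:
    lines = flat.splitlines()
    bodies = [body_element(line) for line in lines if line.startswith("B|")]
    if single_records:
        # Header and trailer are handled separately, so each body record
        # becomes its own XML document.
        return [ET.tostring(b, encoding="unicode") for b in bodies]
    # One schema covering header, repeating body, and trailer: one document.
    root = ET.Element("Batch")
    for line in lines:
        if line.startswith("H|"):
            ET.SubElement(root, "Header").text = line.split("|")[1]
    root.extend(bodies)
    for line in lines:
        if line.startswith("T|"):
            ET.SubElement(root, "Trailer").text = line.split("|")[1]
    return [ET.tostring(root, encoding="unicode")]


print(len(disassemble(FLAT_FILE, single_records=False)))
print(len(disassemble(FLAT_FILE, single_records=True)))
```

The same input yields one document in the first mode and one document per body record in the second, which is the difference the pipeline settings control.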
by stephen-w-thomas | Dec 12, 2004 | Stephen's BizTalk and Integration Blog
Download This Article and Sample Code Here: Debatching Options and Performance Considerations in BizTalk 2004 White Paper and Sample Code
Related Sample: Xml Envelope Debatching
In some business scenarios you may be required to receive a batch file that must be broken up and processed as single messages by BizTalk Server 2004. This could allow for individual record level processing, selective content based routing, or single message mapping.
General Debatching Design Considerations
Here are some general design considerations you might want to think about when planning a debatching approach with BizTalk Server 2004.
- Header and Trailer Validation Requirements for Flat Files
- All or Nothing Debatching
  - Should single records be allowed to fail some type of validation, mapping, or routing?
- Mapping Required
  - As a whole message
  - As single messages
- Last Message Indicator Required
  - Is another process depending on knowing when the last message finished?
- Ordered Message Processing
- Time of Day Processing
  - Will this affect other processes running?
- File Size (less important than in past versions of BizTalk)
- Record Volume
Although this article is focused on general debatching options, I want to go into more detail on some of the design considerations above.
Header and Trailer Validation
Typically batch files are received in a format other than XML, such as flat file. In this case, the file will typically have a Header and Trailer record. These records typically contain information about the number of records in the file, date, time, etc. In some business processes this information needs to be validated prior to processing the file. This creates some interesting design challenges.
Some options for this type of validation include a pre-debatching map, validation of the file using .net, or validation inside an Orchestration. The best option depends on message size since some batch files can be very large (I consider very large as greater than half a gigabyte as XML).
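As one example of the kind of check involved, the Python sketch below compares a record count carried in the trailer against the body records actually present. The layout (`H|...`, `B|...`, `T|<count>`) and the function name are assumptions for illustration. A real file will have its own format, and the check could live in any of the places listed above.

```python
from typing import List


def validate_trailer_count(lines: List[str]) -> bool:
    """Check that the trailer's record count matches the body records present.

    Assumed layout: 'H|...' header, 'B|...' body records, 'T|<count>' trailer.
    """
    if len(lines) < 2 or not lines[0].startswith("H|") or not lines[-1].startswith("T|"):
        return False
    try:
        expected = int(lines[-1].split("|")[1])
    except (IndexError, ValueError):
        return False
    actual = sum(1 for line in lines[1:-1] if line.startswith("B|"))
    return expected == actual


print(validate_trailer_count(["H|20041212", "B|1", "B|2", "T|2"]))
print(validate_trailer_count(["H|20041212", "B|1", "T|2"]))
```

Running a check like this before debatching lets a bad file be rejected as a whole instead of after some of its records have already been processed.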
Last Message Indicator Required
Debatching implies breaking up a file into many different pieces. Sometimes, the single items must still behave as a batch. Some business processes require knowing when the last message in the batch has been processed to activate another process. In addition, sometimes ordered processing of the debatched messages is required.
Ordered Message Processing
Ordered processing of the messages can be accomplished in a few ways. One way is to use an ordered delivery supported adapter, like MSMQt. This would require the debatcher to write the messages in the correct order to the queue for processing. This may also require the use of a convoy to route all the single messages to the same business process. The challenge is to allow for ordered delivery without significantly affecting performance.
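One simple way to carry both an ordering and a last message indicator is to tag each debatched record as it is produced. The Python generator below is a sketch of that idea only. The tuple layout and the function name are assumptions, and it says nothing about how an adapter or convoy would then deliver the messages.

```python
from typing import Iterable, Iterator, Tuple


def with_sequence(records: Iterable[str]) -> Iterator[Tuple[int, bool, str]]:
    """Yield (sequence number, is_last, record) for each debatched record."""
    iterator = iter(records)
    try:
        current = next(iterator)
    except StopIteration:
        return  # empty batch: nothing to tag
    sequence = 1
    for upcoming in iterator:
        # There is at least one more record, so the current one is not last.
        yield sequence, False, current
        current = upcoming
        sequence += 1
    yield sequence, True, current


for seq, is_last, record in with_sequence(["first", "second", "third"]):
    print(seq, is_last, record)
```

A downstream process can then use the sequence number to restore order and the flag to know when the batch is complete.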
BizTalk 2004 Debatching Options
BizTalk 2004 provides us with several different methods for batch file debatching. What is the best way to split up your files? As always, that depends on your exact business scenario.
In this posting, I will look at four different debatching options, review the performance data, and explain the benefits of each type. I also have the test Orchestrations I used available for download. I do not provide any sample files, but you can make your own since the Orchestrations use untyped messages. Just make your structure like: DataData.
The four methods I will cover are:
- Receive Port Pipeline Debatching
- Orchestration XPath Debatching
- Orchestration Atomic Scope Node List Debatching
- Orchestration Outside .Net Component Debatching
Debatching Options Performance Test Results
Here are the results of the performance tests on each type of debatching. Tests were run on a 2.4 GHz desktop with 1.25GB RAM. The sample file produced single records that were 3 KB each. No mapping was done on the files and times do not include time to send the files using a Send Port. This is just the time to run the pipeline or orchestrations. Throughput is intended to show a general idea of the amount of data running through the process; it is not the overall system throughput. Additional tests were run for XPath and Node List that produced larger output files of 29.9 KB and 299.0 KB.
Type | XML Size (MB) | # Msg | Time (Sec) | Msg/Sec | Msg Size (KB) | Throughput (KB/sec)
Receive Port | 1.6 | 500 | 8 | 62.5 | 3.0 | 205
Receive Port | 3.6 | 1100 | 14 | 78.6 | 3.0 | 263
Receive Port | 7.2 | 2200 | 34 | 64.7 | 3.0 | 217
Receive Port | 18.1 | 5500 | 59 | 93.2 | 3.0 | 314
Receive Port | 128.6 | 38500 | 603 | 63.8 | 3.0 | 218
XPath | 1.6 | 500 | 121 | 4.1 | 3.0 | 14
XPath | 3.6 | 1100 | 200 | 5.5 | 3.0 | 18
XPath | 7.2 | 2200 | 667 | 3.3 | 3.0 | 11
XPath | 18.1 | 5500 | 3077 | 1.8 | 3.0 | 6
Node List | 1.6 | 500 | 9 | 55.6 | 3.0 | 182
Node List | 3.6 | 1100 | 21 | 52.4 | 3.0 | 176
Node List | 7.2 | 2200 | 30 | 73.3 | 3.0 | 246
Node List | 18.1 | 5500 | 225 | 24.4 | 3.0 | 82
Node List | 54.3 | 16500 | 1460 | 11.3 | 3.0 | 38
Node List | 128.6 | 38500 | 15256 | 2.5 | 3.0 | 9
.Net Call | 1.6 | 500 | 49 | 10.2 | 3.0 | 33
.Net Call | 3.6 | 1100 | 220 | 5.0 | 3.0 | 17
.Net Call | 7.2 | 2200 | 663 | 3.3 | 3.0 | 11
.Net Call | 18.1 | 5500 | 3428 | 1.6 | 3.0 | 5
.Net Call | 54.3 | 16500 | 27000 | 0.6 | 3.0 | 2
Type | XML Size (MB) | # Msg | Time (Sec) | Msg/Sec | Msg Size (KB) | Throughput (KB/sec)
XPath | 12 | 400 | 232 | 1.7 | 29.9 | 53
XPath | 35.9 | 1200 | 870 | 1.4 | 29.9 | 42
Node List | 12 | 400 | 10 | 40.0 | 29.9 | 1229
Node List | 35.9 | 1200 | 28 | 42.9 | 29.9 | 1313
Node List | 107.7 | 3600 | 128 | 28.1 | 29.9 | 862
Type | XML Size (MB) | # Msg | Time (Sec) | Msg/Sec | Msg Size (KB) | Throughput (KB/sec)
XPath | 14.9 | 50 | 40 | 1.3 | 299.0 | 381
XPath | 59.6 | 200 | 430 | 0.5 | 299.0 | 142
XPath | 119.2 | 400 | 1849 | 0.2 | 299.0 | 66
Node List | 14.9 | 50 | 8 | 6.3 | 299.0 | 1907
Node List | 59.6 | 200 | 27 | 7.4 | 299.0 | 2260
Node List | 119.2 | 400 | 126 | 3.2 | 299.0 | 969
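For anyone who wants to reproduce the derived columns, the Msg/Sec and Throughput figures appear to follow from the size, count, and time columns, assuming 1 MB is treated as 1024 KB. The small Python helper below is a sketch under that assumption, and the function name is made up.

```python
def rates(xml_size_mb: float, message_count: int, seconds: float):
    """Return (messages per second, throughput in KB per second).

    Assumes 1 MB = 1024 KB, which matches the figures reported above.
    """
    msgs_per_sec = message_count / seconds
    kb_per_sec = xml_size_mb * 1024 / seconds
    return round(msgs_per_sec, 1), round(kb_per_sec)


# First Receive Port row: 1.6 MB, 500 messages, 8 seconds.
print(rates(1.6, 500, 8))  # prints (62.5, 205)
```

The same function can be applied to your own test runs to compare them against the numbers here on a like-for-like basis.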
Debatching Options Detailed Overview
Receive Port Pipeline Debatching – A.K.A. Envelope Processing
This type of debatching requires defining an envelope as the basic structure of your message. This is handled a little differently depending on if your input is XML or Flat File. For native XML messages, you must define a Body node that will be broken out by the pipeline component. When receiving Flat Files, life is easier since you have control over the final schema structure to be broken out.
Using this type of debatching, it will not be possible to determine when all of the messages from a large batch have been sent or processed without considerable effort (e.g. using a convoy, which will degrade performance).
For more information and a sample of this type of debatching please see my past post on this topic.
Pros: Fast! Fast! Fast! I am guessing this is because it is all streaming and uses “streaming XPath”. Great for messaging only solutions that require content based routing of single messages.
Cons: All or nothing debatching in the pipeline. Your XML schema must have a top level node for the envelope to strip out the single messages under it. Since the message is not persisted to the database until after the pipeline and map, if something fails in the pipeline or map the entire message will be lost. In addition, I always have a hard time getting envelopes to work correctly. I think that is just user error on my part.
Mapping: Maps are applied to the single messages. If one fails, the whole batch is failed. This limits your flexibility.
Orchestration XPath Debatching – (Best Bet!)
This is my favorite type of debatching. This method of debatching comes from Darren Jefford’s Blog. I like it the best because it provides the most control over the process. I know exactly when the last message has finished. This would be useful if you are using this type of debatching to make a .net call or web service submission for each debatched message inside the loop. Just remember this will lower the performance and you will be running sequentially.
I was shocked at the relatively poor performance of this debatching. When I was testing smaller files, under 100 single messages, I was getting 10+ messages per second.
Even with the slower performance at higher output message sizes, this is still my preferred method of debatching when message size permits. Simple reasons: control and manageability!
Just be careful: I ran a 128 MB file through this and after 1 hour I only had 1500 messages out. I think the slower performance is from the XPath call itself inside the loop. I think it is rescanning the whole message each time I run the loop.
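The shape of this loop can be sketched in Python. This is an analogy only: inside an Orchestration the work is done with XPath calls in expression shapes, while here `xml.etree.ElementTree` stands in. The `Data` and `Row` element names are assumptions for the example.

```python
import xml.etree.ElementTree as ET
from typing import Iterator

# Hypothetical batch structure for the sketch.
BATCH = "<Data><Row>one</Row><Row>two</Row><Row>three</Row></Data>"


def debatch_by_position(batch_xml: str) -> Iterator[str]:
    """Pull out one record per loop pass using a positional path query."""
    root = ET.fromstring(batch_xml)
    count = len(root.findall("./Row"))      # like counting the records before the loop
    for position in range(1, count + 1):    # positions in the path are 1-based
        # Each pass runs a fresh query from the root, which is the step the
        # post suspects of rescanning the whole message.
        record = root.find(f"./Row[{position}]")
        yield ET.tostring(record, encoding="unicode")


for message in debatch_by_position(BATCH):
    print(message)
```

If every positional lookup has to walk the document from the top, the total work grows much faster than the record count, which would be consistent with the drop-off seen on larger files.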
Pros: Excellent flexibility inside the Orchestration. Great control of the processing of your document! This process is sequential and ordered by default. Ability to loop over anything you can XPath and can easily (this is always a relative term) build mini-batches if needed. This is something Receive Pipelines are not able to do without extensive custom code.
Cons: Performance degrades quickly as message size increases. Downright slow on large messages and a resource hog. In some cases, the sequential and ordered processing of this type of debatching may be limiting to the process.
Mapping: Complete flexibility. Map before the loop on the whole message or inside the loop as single messages. Inside the loop, single items can fail mapping and if handled correctly will not stop document processing.
Orchestration Atomic Scope Node List Debatching
This is conceptually a combination of the XPath and Receive Port Debatching. You have the flexibility to loop around anything you can XPath but your process is all or nothing. This must be done inside an Atomic Scope shape since Node List is not serializable.
This type of debatching seems to be more sensitive to output message size rather than input message size. That would make sense, since the smaller the message the more messages the Atomic Transaction will be building up.
To accomplish this debatching, I set up a Node List and an Enumerator inside the Orchestration. Then, I use MoveNext inside a loop to extract the single message for the current node. This involved casting the object to a Node and getting the value using OuterXml. For complete details, see the samples provided.
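The enumerator pattern looks roughly like the Python sketch below. It is an analogy using Python's iterator protocol in place of the .net enumerator, with assumed `Data` and `Row` element names. It is not the Orchestration code from the sample.

```python
import xml.etree.ElementTree as ET
from typing import List


def debatch_with_enumerator(batch_xml: str) -> List[str]:
    """Select the node list once, then step through it with an enumerator."""
    root = ET.fromstring(batch_xml)
    node_list = root.findall("./Row")   # one selection for the whole batch
    enumerator = iter(node_list)
    messages = []
    while True:
        node = next(enumerator, None)   # stands in for MoveNext / Current
        if node is None:
            break
        messages.append(ET.tostring(node, encoding="unicode"))
    # Everything is collected before anything is released, mirroring the
    # all-or-nothing behavior of the Atomic Scope described in the post.
    return messages


print(debatch_with_enumerator("<Data><Row>1</Row><Row>2</Row></Data>"))
```

Because the selection happens once and the loop only advances through the result, there is no repeated query against the full message on each pass.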
Pros: Fast! In some cases, the Atomic Scope may be beneficial.
Cons: All or nothing debatching since you must use an Atomic Scope shape. Massive resource hog! My CPU was maxed at 100% the entire time the process was running. In addition, this seems to tax SQL Server. After running some processes it needed to be restarted or the computer just would not do anything.
In one test, the process ran for over 8 hours maxing out the CPU the whole time just to have something fail and it all roll back.
Mapping: Map before the loop on the whole message or inside the loop as single messages.
Orchestration Outside .Net Component Debatching
This debatching uses an outside .net component to break up the message. The thought here is that the message will not be scanned for each loop. As the performance numbers show, using an outside .net component did not increase performance.
Inside the Orchestration, I created a new instance of a helper class and passed in the input XML message. Then, I looped over the document using a Node List. I returned items based on an index I passed in from a loop shape inside an Orchestration. The performance of this is nearly identical to the XPath method.
Are there any better ways to do it? I looked into using XmlNodeReader inside the .net component but I did not try to get it to work.
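The helper class approach described above has roughly the following shape, shown here in Python for illustration. The class name, its members, and the `Data` and `Row` element names are all hypothetical. The actual component in the sample is a .net class called from the Orchestration.

```python
import xml.etree.ElementTree as ET


class DebatchHelper:
    """Hypothetical helper: parse the batch once, hand out records by index."""

    def __init__(self, batch_xml: str, record_path: str = "./Row"):
        self._records = ET.fromstring(batch_xml).findall(record_path)

    @property
    def count(self) -> int:
        return len(self._records)

    def get_item(self, index: int) -> str:
        """Return the record at a zero-based index as an XML string."""
        return ET.tostring(self._records[index], encoding="unicode")


helper = DebatchHelper("<Data><Row>a</Row><Row>b</Row></Data>")
for i in range(helper.count):       # the Orchestration loop shape supplies the index
    print(helper.get_item(i))
```

In this sketch the document is parsed once in the constructor and each call is a plain list lookup. The measurements above suggest the real component did not get that benefit inside the Orchestration, so treat this as a description of the intent, not of the observed behavior.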
Pros: None. I would not recommend this approach.
Cons: Slow. This adds an additional component to maintain outside of BizTalk.
Mapping: Mapping can be done on the whole message or on single messages.
Conclusion
BizTalk Server 2004 offers a variety of options for breaking up large files for processing. An evaluation of your exact business requirements will help you decide on the option that is best for you.
I welcome questions, comments, feedback, and other ways to debatch large messages! Just drop me an email.
by stephen-w-thomas | Nov 17, 2004 | Stephen's BizTalk and Integration Blog
What? How? When? Why? Useless? You have no clue what I am talking about?
What are BTXTimerMessages?
They are messages BizTalk uses internally to control timers. This includes the delay shape and scope shapes with timeouts.
How will you see BTXTimerMessage?
You will see these messages in HAT. They will be associated with running Orchestrations. If they show up, they will be in the Delivered, Not Consumed (zombie) message status.
When will you see BTXTimerMessages?
I see these messages when I am working with parallel actions, atomic scopes, and convoys. My sample, Limit Running Orchestrations, produces these types of messages. I have not been able to pinpoint the exact shapes or actions that cause these messages to show up.
Why do you see BTXTimerMessages?
Good question. I have a theory, but it is probably wrong. It is that the timer message is returned to the Orchestration after it has passed the point of the delay or scope shape. Thus, it is never consumed by the Orchestration.
Are these messages useless?
I think so. I always ignore them. They will go away when the Orchestration completes or is terminated. These do not seem to act like normal zombies in that they do not cause the Orchestration to Suspend.
Ok, so you have no idea what I am talking about?
Let me fill you in. BTXTimerMessages in the Delivered, Not Consumed status are sometimes seen in HAT when working with Timers. I have not really determined why they happen, but I suspect they should not show up in HAT at all. I do not think they hurt anything and I pay little attention to them. When the Orchestration finally ends or is terminated these messages simply go away. They are annoying and in long running transactions can start to stack up in HAT. Be warned, though: if you try to terminate these messages, they will take down the running Orchestration with them.