The Canadian BizTalk Partner Summit is Coming!

For those of you who won’t be able to make it to the Redmond-based SOA Conference in October, don’t sweat it: you won’t be left without a BizTalk-related conference!


We’ve been working on this for a little while and I’m happy to finally be able to announce it publicly. The Canadian Business Process Integration (BPI) Partner Summit is going to be rolling across Canada at the end of October and the beginning of November. Whether you’re new to BizTalk, experienced, sales focused or technically focused, we’re going to have something for you. The details are below. Hope to see you there.
















Overview:


Microsoft® Canada Business Process Integration (BPI) Partner Summit: Whether you run the business or write the code, the BPI Partner Summit is the event you will want to attend to help ensure your BPI business is wildly successful now and in the future. During this event, technology partners will learn about positioning their BPI business for success, architecting Service Oriented Architectures, using best practices to implement Business Process Management projects, and much more. Technology partners will have an opportunity to attend this event in any of the three cities across Canada listed below. For those attending the Toronto event, we will be joined by industry expert David Chappell, who will share his insights on building a successful business around Microsoft® BizTalk® Server.


About the Summit:
Over 2 days, attendees will learn about the strategies and technologies that Microsoft is delivering over the next year.  Two technical tracks will provide depth and detail into best practices for building BPM solutions, guidance for advancing your SOA, and using the latest development tools and technologies for connecting people, processes and information.   The Business Value track will focus on building a successful business around BizTalk and the Business Process Integration (BPI) partner competency. Whether you need to get up to speed fast, or are ready to dive deep, this is the event for technical training, networking, and business development for technology partners interested in Microsoft’s view of SOA, BPM, and Connected Systems.


When & Where:
October 17-18, 2006 – Calgary, AB
October 25-26, 2006 – Mississauga, ON
November 1-2, 2006 – Montreal, QC


Session Agenda:
Day 1:
Track 1: Business Value
The Business Value track will focus on the business opportunities for technology partners in the BPI competency.  This track has been created specifically for business owners, practice leads and others interested in growing their BPI business. 
Track 2: 101 for BizTalk – technical
The 101 track for BizTalk will provide attendees with a full day of instructor-led, hands-on training for BizTalk 2006. Topics covered will include working with adapters, pipelines, maps, orchestrations, administration, etc. This track is designed to act as a foundation to draw on in Day 2 during the “BPI In-Depth” track and is best suited to those with limited understanding of BizTalk to date.
Day 2:


BPI In-Depth – technical
The BPI In-Depth track will include deep technical sessions and practical guidance on building applications using the latest technologies. Topics covered will include Windows® Workflow Foundation, Service Oriented Architecture, Business Activity Monitoring, best practices and more. This track is best suited for architects, experienced BizTalk developers, and those who participated in the “101 for BizTalk” track.


All sessions will be presented in English.


There is no fee to attend this training.  Please register via the Partner Learning Centre for each track you are interested in.


Questions?
Please forward your questions to [email protected].  


 


 

Fixing "SOAP / Envelope Schema" Error In BizTalk


I was stuck with a shockingly frustrating error message in BizTalk, but thanks to an obscure tweak identified in the BizTalk Developer’s Troubleshooting Guide, all is right with the world.


The error I kept getting was “Document type does not match any of the given schemas.” How did this come about? I receive a batch message into BizTalk and want each item in the batch to process through the orchestration individually. No problem: just use the standard XML Receive Pipeline and an envelope schema, and *poof*, you have debatching. This worked fine with a FILE adapter. Once I created a web service input channel (doing an “expose schema as web service,” since the orchestration accepts an individual message but BizTalk itself must accept the full batch), I started getting the error above. What was mind-boggling was that the schema was indeed deployed and listed in the server repository. I even ran SELECT * FROM bt_documentspec ORDER BY msgtype DESC against the MessageBox database to ensure that my namespace#root was a visible object to BizTalk. Clearly something in the generated web service was hosing me, but everything looked fine to me.


So, how did I fix this? The newsgroups were no help, as I never found anyone who got an answer to this question. After banging my head for a few hours, I took a break to read the BizTalk Developer’s Troubleshooting Guide. There I saw a question that asks, “Why do I receive errors when publishing my envelope schema?” Hey, that’s MY problem! The solution was:

To modify the generated Web project for envelope schemas


  1. Open the .asmx.cs file.

  2. Edit the file and change bodyTypeAssemblyQualifiedName = <dll.name.version> to bodyTypeAssemblyQualifiedName = null

As soon as I did that and reset IIS, my error went away. I’ve never seen that documented anywhere else, which is odd given that this scenario doesn’t seem that outlandish. Hopefully this helps some poor soul in the future.



BRE: Performance Consideration – Documentation in Development

Performance Considerations


Introduction


This topic discusses how the rule engine performs in various scenarios and with different values for the configuration/tuning parameters.


Fact Types


The rule engine takes less time to access .NET facts than it takes to access XML and database facts. If you have a choice between .NET, XML, and database facts in a policy, you should consider using .NET facts for higher performance.


Data Table vs. Data Connection


When the data set is small (fewer than about 10 rows), the TypedDataTable binding performs better than the DataConnection binding, whereas the DataConnection binding performs better when the data set is large (approximately 10 rows or more). Therefore, you should decide whether to use the DataConnection binding or the TypedDataTable binding based on the estimated size of the data set.


Fact Retrievers


You can write a fact retriever, an object that implements standard methods and typically uses them to supply long-term and slowly changing facts to the rule engine before the policy is executed. The engine caches these facts and uses them over multiple execution cycles. Instead of submitting a static or fairly static fact each time you invoke the rule engine, you should create a fact retriever that submits the fact the first time and then updates the fact in memory only when needed.
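
The pattern described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the BizTalk API: ToyEngine, DiscountTableRetriever, update_facts and the 300-second refresh window are all hypothetical names and values chosen for the example.

```python
class ToyEngine:
    """Stand-in for a rule engine's working memory (hypothetical, not BizTalk)."""

    def __init__(self):
        self.facts = {}

    def assert_fact(self, name, value):
        self.facts[name] = value


class DiscountTableRetriever:
    """Supplies a slowly changing fact and reloads it only when it is stale."""

    def __init__(self, load_fact, max_age_seconds):
        self.load_fact = load_fact              # expensive lookup, e.g. a database query
        self.max_age_seconds = max_age_seconds  # assumed refresh window
        self.loads = 0

    def update_facts(self, engine, handle, now):
        # handle is None on the first call; afterwards it carries the last load time.
        if handle is not None and now - handle < self.max_age_seconds:
            return handle                       # fact in memory is still fresh
        engine.assert_fact("discount_table", self.load_fact())
        self.loads += 1
        return now


engine = ToyEngine()
retriever = DiscountTableRetriever(lambda: {"gold": 0.10}, max_age_seconds=300)

handle = None
for now in (0, 60, 120, 400):                   # four policy executions
    handle = retriever.update_facts(engine, handle, now)

print(retriever.loads)                          # the fact was loaded twice, not four times
```

The handle returned from one call and passed into the next is what lets the retriever decide whether the fact already in memory is good enough.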


Rule Priority


The priority setting for a rule can range on either side of 0, with larger numbers having higher priority. Actions are executed in order from the highest priority to the lowest. When the policy implements forward-chaining behavior by using Assert/Update calls, the chaining can be optimized by using the priority setting. For example, assume that Rule2 has a dependency on a value set by Rule1. Giving Rule1 a higher priority means that Rule2 will only execute after Rule1 fires and updates the value. Conversely, if Rule2 were given a higher priority, it could fire once, and then fire again after Rule1 fires and updates the fact that Rule2 is using in a condition. This may or may not produce correct results, but it clearly has a performance cost compared with firing only once.
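
To make the Rule1/Rule2 example concrete, here is a toy agenda loop in Python. It is a simplified model written for this post, not how the BizTalk rule engine is implemented; the fact names (amount, discount, total) and the priorities are made up.

```python
def run(priorities):
    """Fire rules in priority order and count how often each one fires."""
    facts = {"amount": 10, "discount": 0}
    fired = {"Rule1": 0, "Rule2": 0}

    def rule1_action():
        facts["discount"] = 5
        return ["Rule2"]          # Rule2 reads discount, so it goes back on the agenda

    def rule2_action():
        facts["total"] = facts["amount"] - facts["discount"]
        return []

    actions = {"Rule1": rule1_action, "Rule2": rule2_action}
    agenda = {"Rule1", "Rule2"}   # both conditions are assumed true at the start

    while agenda:
        name = max(agenda, key=lambda rule: priorities[rule])
        agenda.remove(name)
        fired[name] += 1
        agenda.update(actions[name]())
    return fired, facts["total"]


print(run({"Rule1": 10, "Rule2": 0}))   # Rule1 first: Rule2 fires once
print(run({"Rule1": 0, "Rule2": 10}))   # Rule2 first: Rule2 fires twice
```

Both orderings end with the same total in this toy case, but the second one does the work of Rule2 twice.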


Update Calls


The Update function updates a fact that exists in the working memory of the rule engine and causes all the rules that use the updated fact in their conditions to be reevaluated. Update calls can be expensive, especially if a large set of rules needs to be reevaluated as a result. There are situations where they can be avoided. For example, consider the following rules:


Rule1:


IF PurchaseOrder.Amount > 5


THEN StatusObj.Flag = true; Update(StatusObj)


Rule2:


IF PurchaseOrder.Amount <= 5


THEN StatusObj.Flag = false; Update(StatusObj)


 


All remaining rules of the policy use StatusObj.Flag in their conditions. Therefore, when Update is called on the StatusObj object, all of those rules are reevaluated. Whatever the value of the Amount field is, all the rules except Rule1 and Rule2 are evaluated twice: once before the Update call and once after it.


 


Instead, you could set the value of the flag field to false prior to invoking the policy and then use only Rule1 in the policy to set the flag. In this case, Update is called only if the value of the Amount field is greater than 5, and it is not called if the amount is less than or equal to 5. Therefore, the remaining rules are evaluated twice only if the value of the Amount field is greater than 5.
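
The saving can be counted with a small Python model. It assumes, purely for illustration, that 8 other rules read StatusObj.Flag and that each Update call reevaluates every one of them once; the function and variable names are hypothetical.

```python
def evaluations(amount, rules, dependents=8):
    """Count condition evaluations of the rules that read StatusObj.Flag."""
    count = dependents            # every dependent rule is evaluated once up front
    for condition in rules:
        if condition(amount):
            count += dependents   # Update(StatusObj) reevaluates all of them
    return count


two_rules = [lambda a: a > 5, lambda a: a <= 5]   # Rule1 and Rule2
one_rule = [lambda a: a > 5]                      # Rule1 only, flag preset to false

for amount in (3, 9):
    print(amount, evaluations(amount, two_rules), evaluations(amount, one_rule))
```

With both rules in the policy the dependent rules are always evaluated twice (16 evaluations here); with Rule1 alone, an amount of 3 costs only the initial 8.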


Usage of Logical OR Operators


Using an increasing number of logical OR operators in conditions creates additional permutations that expand the analysis network of the rule engine.  From a performance standpoint, you are better off splitting the conditions into atomic rules that do not contain logical OR operators.


Caching Settings


The rule engine uses two caches. The first is in the update service and the second is in each BizTalk process. The first time a policy is used, the BizTalk process requests the policy information from the update service. The update service retrieves the policy information from the rule engine database, caches it, and returns the information to the BizTalk process. The BizTalk process creates a policy object based on that information and stores the policy object in a cache when the associated rule engine instance finishes executing the policy. When the same policy is invoked again, the BizTalk process reuses the policy object from the cache if one is available. Similarly, when a BizTalk process requests information about a policy from the update service, the update service first looks for the policy information in its own cache. The update service also checks the database for policy updates every 60 seconds (1 minute). If there are any updates, the update service retrieves and caches the updated information.


 


There are three tuning parameters for the rule engine related to these caches: CacheEntries, CacheTimeout, and PollingInterval. You can specify the values for these parameters either in the registry or in a configuration file. The value of CacheEntries is the maximum number of entries in the cache. The default value of the CacheEntries parameter is 32. You may want to increase the value to improve performance in some cases. For example, if you are using 40 policies repeatedly, you may want to increase CacheEntries to 40. This would allow the update service to cache details of up to 40 policies in memory, and it would allow the BizTalk service to cache up to 40 policy instances in memory. There may be more than one instance of a policy in the cache of the BizTalk service.


 


The value of CacheTimeout is the time (in seconds) for entries to age out of the update service cache. In other words, the CacheTimeout value is how long a cache entry for a policy is kept in the cache without being referenced. The default value of the CacheTimeout parameter is 3600 seconds (1 hour), which means that if the cache entry is not referenced within an hour, it is deleted. In some cases, you may want to increase the value to improve performance. For example, if a policy is invoked every 2 hours, you could improve the performance of the policy execution by increasing the CacheTimeout parameter to a value higher than 2 hours.


 


The PollingInterval parameter of the rule engine defines the time in seconds between the update service’s checks of the rule engine database for updates. The default value of the PollingInterval parameter is 60 seconds (1 minute). If you know that the policies are never or only rarely updated, you could increase this value to improve performance.
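
The effect of CacheEntries and CacheTimeout can be illustrated with a small Python cache. This is a sketch under assumptions: the PolicyCache class is hypothetical, and the least-recently-used eviction it applies is a guess made for the example, since the text above does not say how the real caches pick an entry to drop.

```python
from collections import OrderedDict


class PolicyCache:
    """Toy cache with a size limit and an idle timeout (not the BizTalk code)."""

    def __init__(self, cache_entries=32, cache_timeout=3600):
        self.cache_entries = cache_entries
        self.cache_timeout = cache_timeout
        self._items = OrderedDict()       # policy name -> (policy, last used time)
        self.loads = 0                    # how often we had to go to the database

    def get(self, name, now, load):
        entry = self._items.get(name)
        if entry is not None and now - entry[1] <= self.cache_timeout:
            policy = entry[0]             # cache hit
        else:
            policy = load(name)           # missing or aged out: reload
            self.loads += 1
        self._items[name] = (policy, now)
        self._items.move_to_end(name)
        while len(self._items) > self.cache_entries:
            self._items.popitem(last=False)   # assumed: evict least recently used
        return policy


def loads_for(timeout):
    """Invoke one policy every 2 hours and report how often it was reloaded."""
    cache = PolicyCache(cache_timeout=timeout)
    for hour in (0, 2, 4, 6):
        cache.get("Discounts", hour * 3600, lambda name: object())
    return cache.loads


print(loads_for(3600), loads_for(3 * 3600))
```

With the 1-hour default every invocation reloads the policy (4 loads); with a 3-hour timeout it is loaded once and reused.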


Side Effects


The ClassMemberBinding, DatabaseColumnBinding, and XmlDocumentFieldBinding classes have a property named SideEffects. This property determines whether the value of the bound field, member, or column is cached. The default value of the SideEffects property in the DatabaseColumnBinding and XmlDocumentFieldBinding classes is false, whereas the default in the ClassMemberBinding class is true. Therefore, when a field of an XML document or a column of a database table is accessed for the second time or later within the policy, the value is retrieved from the cache. When a member of a .NET object is accessed for the second time onward, however, the value is retrieved from the .NET object, not from the cache. Setting the SideEffects property of a .NET ClassMemberBinding to false will improve performance because the value is retrieved from the cache from the second access onward. You can only do this programmatically. The Business Rule Composer tool does not expose the SideEffects property.
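
The difference between the two settings can be modeled in Python. The MemberBinding class below is a hypothetical stand-in written for this illustration, not the real ClassMemberBinding; it only shows what "cached or not" means for the number of calls into the bound object.

```python
class MemberBinding:
    """Toy binding: with side_effects=False the first value read is cached."""

    def __init__(self, getter, side_effects=True):
        self.getter = getter
        self.side_effects = side_effects
        self._cached = False
        self._value = None

    def read(self):
        if self.side_effects:
            return self.getter()          # always go back to the object
        if not self._cached:
            self._value = self.getter()   # first access fills the cache
            self._cached = True
        return self._value


def count_calls(side_effects, reads=3):
    """Read the same member several times and count calls into the object."""
    calls = {"n": 0}

    def get_credit_limit():
        calls["n"] += 1
        return 5000

    binding = MemberBinding(get_credit_limit, side_effects=side_effects)
    values = [binding.read() for _ in range(reads)]
    return calls["n"], values


print(count_calls(True)[0], count_calls(False)[0])
```

Caching is only safe when the member really has no side effects and returns the same value for the duration of the policy execution, which is presumably why the conservative default for .NET members is true.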


Instances and Selectivity


The XmlDocumentBinding, ClassBinding and DatabaseBinding classes have two properties, Instances and Selectivity. The value of the Instances property is the expected number of instances of the class in working memory. The value of the Selectivity property is the percentage of the class instances that will successfully pass the rule conditions. The rule engine uses these values to optimize condition evaluation so that the smallest possible number of instances is used in condition evaluations first, followed by the remaining instances. If you have prior knowledge of the number of instances of the object, setting the Instances property to that value would improve performance. Similarly, if you have prior knowledge of the percentage of these objects that pass the conditions, setting Selectivity to that value would improve performance. You can only set values for these properties programmatically. The Business Rule Composer tool does not expose them.


 


File Adapter And Partial Stream Reads

I’ve been playing for the last couple of days with BizTalk Server 2006, building a
custom encoder pipeline component. One of the things I’ve been trying to do is to find
a way to do all encoding operations in a streaming fashion, by building a pass-through
stream implementation that only reads and encodes small portions of the message as
they are read by the outbound adapter.

One of the options I’ve been experimenting with was to do partial reads and writes
on an intermediate memory stream: Instead of reading and encoding the entire body
part of the message in a single pass in memory, and then returning that from the encoder
component, I only read from the original stream as much as the adapter asks me for
in the Stream.Read() implementation, encode and write that into the intermediate memory
stream, and then read back from it and return it to the adapter.

I realize it sounds a little convoluted, but it’s the easiest way to do it with
the library I’m using to do the encoding. One of the reasons why this works fairly
well is that the encoding process does some compression, and so, it will be the case
that whatever I read and encode from the original stream will be smaller than what
was originally asked for. For example, if someone tried to read 64KB from my custom encoding
stream, I might return just a couple of KB even if there’s further data in the original
stream. Granted, it is not the most efficient implementation, but it ensures I use
little memory during the encoding operations.

Note: This is not a problem in the .NET Stream API; if you’re reading from a stream
you must be prepared to deal with partial reads. A partial read does not mean
that you’ve read the end of the stream nor that a problem was encountered.
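
To illustrate what dealing with partial reads looks like, here is a minimal sketch in Python rather than .NET. TrickleStream and read_fully are hypothetical names made up for the example; the 38-byte chunk size simply makes every read come back short.

```python
import io


class TrickleStream(io.RawIOBase):
    """Wraps bytes and hands back at most `chunk` bytes per read call."""

    def __init__(self, data, chunk):
        self._inner = io.BytesIO(data)
        self._chunk = chunk

    def readable(self):
        return True

    def readinto(self, b):
        data = self._inner.read(min(len(b), self._chunk))
        b[:len(data)] = data
        return len(data)


def read_fully(stream, size):
    """Keep reading until `size` bytes arrive or the stream really ends."""
    buf = bytearray()
    while len(buf) < size:
        chunk = stream.read(size - len(buf))
        if not chunk:             # only an empty result means end of stream
            break
        buf += chunk
    return bytes(buf)


payload = b"x" * 10000
short = TrickleStream(payload, 38).read(4096)
full = read_fully(TrickleStream(payload, 38), 4096)
print(len(short), len(full))
```

A single read returns only 38 bytes even though plenty of data remains, while the loop collects the full 4096.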

Now, I know this works, as I unit tested the encoding stream and component in isolation
(using my PipelineTesting library).
However, when I went to try my custom pipeline component in a real messaging scenario
with the File adapter, it failed, and miserably: The BizTalk host would pretty much
crash (after a huge spike of 100% processor usage) with the following error: “The
parameter is incorrect”. Hmm, not very useful.

At this point I took out the debugger and attached to BTSNTSvc.exe to try and repro
the error. I was able to track my custom pipeline component getting called, and see
BizTalk read off my custom stream. At this point I noticed weird things.

The first thing I noticed was that the file adapter (or is it the BizTalk messaging
engine itself?) uses very small buffers to read off the message streams. Indeed, it
only reads it in 4KB chunks. That seems rather small to me, particularly considering
the fact that the FILE adapter is an unmanaged adapter and so each Read() call into
the stream will cause unmanaged<->managed transitions which are costly.
I would’ve expected it to at least use buffers of 64KB, but maybe there’s a good reason
for that.

That by itself was not too much of an issue; my component was perfectly capable of
dealing with that and indeed I had unit tests using both 4KB and 64KB buffers (though
I believe it is the cause of the poor performance and high CPU usage I noticed).
The real problem was that my stream would almost always do a partial read, and the
adapter seemed unable to cope with that, as it started asking for weird buffer sizes
on consecutive Read() calls.

Here’s a small table that shows how Read() was called each time (this was on a run with
a 5MB file):

Iteration  Buffer Size  Offset  Count  Bytes Read
1          4096         0       4096   38
2          4096         0       4058   383
3          4096         0       3674   47
4          4096         0       3628   50
5          4096         0       3578   50
6          4096         0       3528   50

As you can probably guess by now, what seems to be happening is that if the stream
does a partial read, then the next time around the adapter asks for a read of (buffer.Size-bytesRead) length
(or close enough). Eventually, that length reaches zero before the stream has been
totally consumed, which causes an exception because zero is an invalid parameter
value. I’m not sure if this is a bug in the file adapter, or if it’s simply a side effect
of the way the managed<->unmanaged code interaction happens in the messaging
engine, but I thought it was something worth looking at more closely.
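
The behavior I suspect can be reproduced with a few lines of Python. This models my guess about the caller, not the adapter's actual code; requested_counts is a hypothetical helper, and it assumes every partial read returns exactly 50 bytes, like the later rows of the table.

```python
def requested_counts(buffer_size, bytes_per_read, total_bytes):
    """Return the count passed to each Read call under the guessed caller logic."""
    counts = []
    count, remaining = buffer_size, total_bytes
    while remaining > 0:
        counts.append(count)
        if count == 0:
            break                      # zero-length request while data remains
        got = min(count, bytes_per_read, remaining)
        remaining -= got
        count -= got                   # suspected problem: count is never reset
    return counts


counts = requested_counts(4096, 50, 5 * 1024 * 1024)
print(counts[:4], len(counts), counts[-1])
```

Under these assumptions the requested count shrinks by 50 on every call and hits zero on the 83rd request, long before a 5MB stream is exhausted.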

I’m planning on working around this by doing some extra work to do full reads as much as
possible and avoid the partial ones (the ones I’m doing now are obviously inefficient,
though that’s partly because the input file is highly compressible). Hopefully that
should make this a non-issue.
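
One way to get full reads is a thin wrapper that keeps pulling from the inner stream until the caller's buffer is full. The sketch below shows the idea in Python rather than in the .NET code of my component; FillingReader and PartialStream are hypothetical names used only here.

```python
import io


class PartialStream(io.RawIOBase):
    """Test source that returns at most `chunk` bytes per read call."""

    def __init__(self, data, chunk):
        self._inner = io.BytesIO(data)
        self._chunk = chunk

    def readable(self):
        return True

    def readinto(self, b):
        data = self._inner.read(min(len(b), self._chunk))
        b[:len(data)] = data
        return len(data)


class FillingReader(io.RawIOBase):
    """Loops over the inner stream so reads are short only at end of stream."""

    def __init__(self, inner):
        self._inner = inner

    def readable(self):
        return True

    def readinto(self, b):
        view = memoryview(b)
        filled = 0
        while filled < len(view):
            n = self._inner.readinto(view[filled:])
            if not n:                  # inner stream is exhausted
                break
            filled += n
        return filled


reader = FillingReader(PartialStream(b"x" * 10000, 50))
sizes = [len(reader.read(4096)) for _ in range(3)]
print(sizes)
```

Every read now fills the 4KB buffer except the last one, so a caller that mishandles short reads only ever sees one at the very end of the stream.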


Tutorial Updates

If you are pulling your hair out wondering why you followed the tutorial to the letter and yet things aren’t working as they should, here is Lisa’s blog, which provides the latest updates to the tutorial. Please make sure you download the latest tutorial from the MSDN site instead of using the outdated one […]