Suddenly SOA does not exist!…

…or at least that’s what Clemens Vasters thinks. A little bit radical, just to make the point that SOA is an orientation, not an architecture in itself… anyway, I agree with him, and I’ve added him to my blogroll…


It’s true that there’s too much hype around SO and SOA, so I’ll try to explain my point of view of SO in 3 over-simplified steps:


I want to benefit from Service Orientation. What should I do?



  1. Design the different elements of the architecture to be independent of each other (stateless, autonomous)
  2. For each element, separate the interface from the implementation (see the sketch below)
  3. For each interface, use a technology that provides platform-independence and location-independence (SOAP over HTTP is a good example, but it’s not the only one)

…and that’s all, you are Service Oriented!
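
To make step 2 concrete, here’s a minimal C# sketch; the service name and its operation are made up for illustration. Callers program against the interface, while the implementation behind it can change or move without breaking them; expose the interface over SOAP/HTTP (step 3) and the caller’s platform stops mattering too.

// A minimal sketch of steps 1 and 2 (names are hypothetical).
// The interface is the contract callers depend on; it is stateless,
// so each call carries everything the operation needs.
public interface IQuoteService
{
    decimal GetQuote(string symbol);
}

// The implementation hides behind the interface and can be swapped,
// re-hosted, or exposed over SOAP/HTTP without callers noticing.
public class QuoteService : IQuoteService
{
    public decimal GetQuote(string symbol)
    {
        return 42.0m; // placeholder; a real service would do a lookup here
    }
}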


Well, we could discuss whether platform-independence is really required, but that’s another post…

BAM: Tracking Portal

As we all know, BizTalk Server 2004 provides a great set of enabling technologies to address a number of business problems, specifically around integration and business process management. The very nature of this technology means that BizTalk is typically deployed into the very heart of the enterprise and often has very mission-critical applications running on it. With this, of course, comes a big responsibility in terms of how we build real enterprise-class applications on BizTalk. Over the coming weeks and months I intend to push out to a broader community some of the learnings and findings that I believe make a real difference in terms of increasing the chances that a BizTalk Server deployment will be successful, not only in terms of getting it to go-live but also around ensuring its cost of ownership is kept to a minimum. I’ll try to keep these all under the theme of Enterprise Architecture. We’re also doing a much bigger piece of work around this, but more about that another time :-).


 


The first of these areas is the Tracking Portal. Darren has been hounding me (and rightly so) to blog some stuff on this for quite a while now, so dude, here it is finally :-). The Tracking Portal often takes a back seat to getting the solution out of the door; the fact of the matter, however, is that tracking is core functionality in my opinion and needs to be designed in from the outset. A Tracking Portal really needs to surface two views into the solution. First, it needs to provide a view into the business for the users of the application; sometimes this means MI, but other times simply the business view of long-running transactions. Most people get this, and also that BAM is well placed to provide it; I have to say what great technology BAM is, and it usually provides the ’wow’ for the business users. The second view is often neglected, but it’s equally important: it needs to provide a view into the solution from a functional perspective, for the solution support team to use to troubleshoot the live system. There are other tools like HAT that can be used to troubleshoot, but HAT does not understand what combination of message flows constitutes a business transaction, and typically this is what both the business and support users care about, rather than individual discrete messages.


 


The approach that we have been using over here in jolly old Blighty for a while now is to build the Tracking Portal using BAM to capture the key state and business transaction information, and SQL Reporting Services (another technology that I think is very cool indeed) to render the Business and Support views from the tracking database. Using this approach the Tracking Portal can be put together very cheaply with a rich UI that also has broad reach, and in my experience providing these views, especially the Support view, can get you out of some very difficult situations after you go live. Ok, so let’s dig a little deeper.


 


Using BAM to Capture Key Business Transaction Information


There are essentially three ways you can use BAM: the TPE, the BufferedEventStream, or the DirectEventStream. The first, the TPE, allows you to define what tracking data is captured at specific points in an Orchestration. While this is great for setting up quickly, it has the problem that every time you change your Orchestration you need to redefine your tracking profile. The second problem is that you can’t use it in the pipeline; for some scenarios this presents a break in your tracking data between the time the business transaction reaches the adapter and the time the message is published by the messaging engine and routed to the appropriate orchestration. For me, these two reasons typically cause me to discount the TPE.


The other two are very similar to each other in usage; the main difference is the way the tracking data is transferred to the BAMPrimaryImport database. The BufferedEventStream is processed asynchronously, which essentially means less of a performance hit, while the DirectEventStream causes the tracking data to be written directly to the BAMPrimaryImport database.



In general the BufferedEventStream is preferable for performance reasons. As I mentioned earlier, getting a full view of how business transactions flow generally means that the BufferedEventStream needs to be called from a custom pipeline component in the receive pipeline, from the orchestrations, and again from a custom pipeline component on the outbound path. Providing a simple stateless wrapper API around the calls to the BufferedEventStream is therefore a good approach; this API can then be called from a pipeline component or an orchestration.
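
As a sketch of what such a wrapper might look like (the activity name, connection string, and helper names here are my own inventions for illustration, not a prescribed API):

using System;
using Microsoft.BizTalk.Bam.EventObservation;

// A minimal stateless wrapper around the BufferedEventStream. Each method
// creates the stream, writes, and flushes, so it is safe to call from a
// pipeline component or an orchestration Expression shape.
public sealed class TransactionTracker
{
    private TransactionTracker() { }

    // The BufferedEventStream writes via the MessageBox, hence the
    // MessageBox connection string (this value is an assumption).
    private const string ConnectionString =
        "Integrated Security=SSPI;Data Source=.;Initial Catalog=BizTalkMsgBoxDb";

    public static void BeginTransaction(string activityName, string txnId)
    {
        BufferedEventStream es = new BufferedEventStream(ConnectionString, 1);
        es.BeginActivity(activityName, txnId);
        es.Flush();
    }

    public static void RecordMilestone(string activityName, string txnId,
                                       string milestone, DateTime at)
    {
        BufferedEventStream es = new BufferedEventStream(ConnectionString, 1);
        es.UpdateActivity(activityName, txnId, milestone, at);
        es.Flush();
    }

    public static void EndTransaction(string activityName, string txnId)
    {
        BufferedEventStream es = new BufferedEventStream(ConnectionString, 1);
        es.EndActivity(activityName, txnId);
        es.Flush();
    }
}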


In order to tie the individual messages up into a message flow which represents a ’business transaction’, a unique business transaction ID is needed. Often there is the notion of some unique identifier for message flows, depending on the scenario, but in the scenarios where there isn’t, the system context property InterchangeID may be used, since it is guaranteed to be unique and is flowed with the message even as the message is cloned through each stage in the pipeline. If the message is copied in the orchestration, it is the user who is responsible for ’flowing’ the InterchangeID.


Each new business transaction starts a new Activity; this is done by calling the EventStream.BeginActivity() API, and an activity is ended by calling EventStream.EndActivity(). Of course, for business transactions that span the receive pipeline, an orchestration, and so on, BAM has the concept of continuing an activity; for this, the EventStream.EnableContinuation() API is called, passing in a unique identifier to tie the activities together. The property BufferedEventStream.FlushThreshold is used to control the frequency at which the events are flushed to the BAM database; in general, setting this to one works well. Neither the DirectEventStream nor the BufferedEventStream is currently serializable; the implication of using them from Orchestrations is that they need to be created each time they are used, rather than being held in a local variable in the Orchestration. This may sound like an issue, but in the past we’ve taken customer scenarios into the perf lab and profiled the BizTalk solution to understand the cost associated with using the EventStream in this manner; in the grand scheme of things, the cost of creating it and using it with a flush threshold of one is not significant.
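
In an orchestration this typically boils down to a one-liner in an Expression shape, calling a stateless helper like the TransactionTracker sketched above (names, again, are illustrative):

// Expression shape: record a milestone for this business transaction.
// txnId is an orchestration string variable carrying the unique ID
// (often the InterchangeID, as discussed above).
TransactionTracker.RecordMilestone(
    "OrderTransaction", txnId, "ReachedOrchestration", System.DateTime.UtcNow);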


Pipeline components can get an EventStream from the pipeline context using IPipelineContext.GetEventStream(). This was added in QFE 1117 and is in SP1; it ensures transactional integrity between the tracking data and the message box interactions.
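
A receive pipeline component might use it along these lines. This is a sketch assuming SP1 or later; the activity name and continuation-token scheme are my own, and error handling is left out:

using System;
using Microsoft.BizTalk.Bam.EventObservation;
using Microsoft.BizTalk.Component.Interop;
using Microsoft.BizTalk.Message.Interop;

// Execute() from a custom receive pipeline component (the other IComponent
// members are omitted for brevity). The EventStream obtained from the
// pipeline context takes part in the same transaction as the MessageBox
// work, so tracking data and message publication commit or roll back together.
public IBaseMessage Execute(IPipelineContext pContext, IBaseMessage pInMsg)
{
    EventStream es = pContext.GetEventStream();

    // Use the InterchangeID system context property as the business
    // transaction ID ("OrderTransaction" is an illustrative activity name).
    string txnId = (string)pInMsg.Context.Read(
        "InterchangeID",
        "http://schemas.microsoft.com/BizTalk/2003/system-properties");

    es.BeginActivity("OrderTransaction", txnId);
    es.UpdateActivity("OrderTransaction", txnId, "ReceivedAt", DateTime.UtcNow);

    // Hand the activity off so the orchestration can continue it later.
    es.EnableContinuation("OrderTransaction", txnId, "Orch_" + txnId);

    return pInMsg;
}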


One final point: it is also possible to track message bodies using BAM, though there is a limit on the size of the message body of 3885 bytes, which isn’t that big. Having said that, using the approach outlined here, we’ve measured the cost of this tracking on a complex customer scenario and found it to be in the order of 20% slower than the same solution with no tracking, which is pretty impressive.


 


Using SQL Reporting Services to Build the Portal


Ok, so now that we have a very flexible mechanism for collecting tracking data for business transactions at run-time, we need a slick way to render the web-based tracking portal; enter SQL Reporting Services. SQL RS can be used to build the tracking portal views (business-focused and support-focused) very easily; it allows you to use drag-and-drop techniques to build the UI and define the layout of the report. Since the portal is built on SQL RS, it’s really easy to provide rich querying capabilities, as well as hierarchical views that, for example, allow you to look at a business transaction and then drill down to the individual message flows. The approach here is to point SQL RS at the BAM views that are generated when the activities are deployed, and allow SQL RS to query over these views.


 

BizUnit v2.0 is imminent…

Version 2 of BizUnit is pretty close to completion. The new version has the notion of passing state between the individual test steps, plus plenty of new test steps, not only from myself but also from a number of customers and Microsoft partners.


As I mentioned previously, for those of you who haven’t used BizUnit, it’s a framework for the rapid development of automated tests. It was initially targeted at BizTalk, but it’s by no means restricted to it. A number of my customers, past and present, have had great success using it.


If you have test steps that you would like to have included in Version 2, please let me know; I’m aiming to get it finished in the next month or so.

Schema Design Patterns: Venetian Blind

This is the third of five entries talking about schema design patterns.  In previous entries the Russian Doll approach and the Salami Slice approach were discussed.


The Venetian Blind approach is similar to the Russian Doll approach in that they both use a single global element.  The Venetian Blind approach describes a modular design, naming and defining all type definitions globally (as opposed to the Salami Slice approach, which declares elements globally and types locally).  Each globally defined type describes an individual “slat” and can be reused by other components.  In addition, all the locally declared elements can be namespace-qualified or namespace-unqualified (the slats can be “opened” or “closed”) depending on the elementFormDefault attribute setting at the top of the schema.  If elementFormDefault is “unqualified”, then the local elements in the instance document must not be qualified with the prefix of the namespace.


<?xml version="1.0" encoding="UTF-8"?>
<xs:schema targetNamespace="TargetNamespace" xmlns:TN="TargetNamespace" xmlns:xs="http://www.w3.org/2001/XMLSchema"
elementFormDefault="qualified" attributeFormDefault="unqualified">
    <xs:element name="BookInformation" type="TN:BookInformation"/>
    <xs:complexType name="BookInformation">
        <xs:sequence>
            <xs:element name="Title"/>
            <xs:element name="ISBN"/>
            <xs:element name="PeopleInvolved" type="TN:PeopleInvolvedType" maxOccurs="unbounded"/>
        </xs:sequence>
    </xs:complexType>
    <xs:complexType name="PeopleInvolvedType">
        <xs:sequence>
            <xs:element name="Author"/>
            <xs:element name="Publisher"/>
        </xs:sequence>
    </xs:complexType>
</xs:schema>
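
For illustration, here is an instance document that would validate against the schema above (the element values are made up). Because elementFormDefault is "qualified", all the locally declared elements live in the target namespace, carried here by the default namespace declaration on the root:

<?xml version="1.0" encoding="UTF-8"?>
<!-- All elements are namespace-qualified: the slats are "open". -->
<BookInformation xmlns="TargetNamespace">
    <Title>Some Title</Title>
    <ISBN>0-000-00000-0</ISBN>
    <PeopleInvolved>
        <Author>Some Author</Author>
        <Publisher>Some Publisher</Publisher>
    </PeopleInvolved>
</BookInformation>

Had elementFormDefault been "unqualified", only the root BookInformation element would belong to the target namespace, and the inner elements would have to appear without any namespace; the slats would be "closed".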


 


The advantages are that, since all complex and simple types are defined globally, they are available for reuse.  In addition, the option exists to hide the namespace prefix for all locally defined elements in the instance document.


The disadvantages are that the schema is verbose, it is not self-contained, and it may be coupled with other schemas.


This type of approach is good when flexibility, reuse and namespace exposure are important.  This approach uses a combination of local and global types, unlike the Russian Doll approach, in which all components are declared locally, and the Salami Slice approach, in which all components are declared globally.  This is important as it provides the flexibility to create a schema for most needs, since the types can be assigned to elements and extended or restricted as needed.  This would be an appropriate design when data is transferred between diverse organizations or business units, since it gives each group the flexibility to make modifications for its specific requirements.

BizTalk Assembly Viewer BTSAsmExt.dll

Hey, I’m just working through MS Course 2158A (doing these courses is a good way to tidy up loose ends) and discovered BTSAsmExt.dll, the BizTalk Assembly Viewer.


This little utility allows one to see the assemblies, and their components, which have been installed into the GAC. What I find it useful for is that it’s a nice & quick little way to view all the XML which makes up orchestrations and their correlation sets, messages & variables, maps, schemas, pipelines and physical ports (as opposed to logical orchestration ports).


It’s also good for:


– Seeing which assemblies have been deployed


– Viewing assembly attributes such as version


– Seeing if an assembly has any referenced assemblies


– Adding/removing assemblies


– Launching the BizTalk Deployment Wizard


To install it … http://msdn.microsoft.com/library/default.asp?url=/library/en-us/deploying/htm/ebiz_depl_assemblies_pade.asp


Once installed, look under My Computer in Windows Explorer for BizTalk Server Assemblies.


R. Addis

TechEd 2005 Orlando Instructor Led Labs

Congratulations to everyone who is going to attend TechEd 2005 in Orlando, FL; it is now sold out.



I just wanted to point out a few exciting instructor-led labs that I will be leading.  If you like my blog content, chances are you will like these labs.


 


These labs will give you a chance to walk step-by-step through the topics listed below.  I will also be available to answer questions.


 


Using InfoPath SP1 and SharePoint 2003 to Design an Effective Tracking System for BizTalk Server 2004


 


Abstract: Learn how to utilize InfoPath SP1 through SharePoint to display business process exceptions returned from within a business process. Learn how to build a simple InfoPath form to display exception information, publish this form into SharePoint, use processing instructions to reference this form, and use delivery notification to catch delivery errors.


 


 


Using Sequential Message Processing and the Flat File Disassembler to Map a Large Flat File in BizTalk Server 2004


 


Abstract: In this lab, you will learn to use the native features of the Flat-file disassembler to break up a large inbound flat file into single messages to allow for easy processing inside BizTalk.  Without this ability to break up the document, it would be difficult to process documents of this size.  In order to maintain the order of the single messages inside the Message Box, the messages will be processed sequentially using a Convoy Messaging Pattern.  Each record will be mapped and written in order to an output file. Delivery notification will be used to ensure sequencing and delivery of the single messages.


 


 


If you do not make it to my labs, you can always catch me in the cabana room or after these sessions if you have any questions or just want to say “HI”.


 


Processing Large Interchanges

Recently I’ve had quite a few customers asking about the processing of large interchanges, and the restrictions around them. Firstly, let’s make sure we’re on the same page regarding what an interchange is: it’s a single message that contains many individual messages that are disassembled in the receive pipeline. An example of this would be a flat file which is disassembled (using the flat-file disassembler) to produce many smaller messages, each of which is published independently.


As you’ll know, BizTalk Server 2004 has a strong large-message story; the engine jumps through hoops in order to keep a flat memory model at run-time. If you consider the implications of this for large interchanges, they are quite interesting…


Interchanges are processed using one atomic transaction, meaning that either all of the messages disassembled from the interchange are published to the message box or none of them are. But clearly, if the messages disassembled from an interchange were written to the message box only after the interchange had been entirely disassembled, the engine would reach an out-of-memory condition for large interchanges. In order to keep the memory model flat, the engine breaks the interchange into many sub-batches under the scope of a transaction; the transaction is committed only once the interchange has been entirely disassembled successfully. Because the interchange is sub-divided, the memory for each sub-batch may be relinquished, keeping the memory model flat. This all sounds well and good; however, each one of these messages holds a SQL lock, and SQL does not have a limitless number of locks!!


For large interchanges, where large is, say, 100,000 messages, it’s pretty easy to reach this lock limit when concurrently processing large interchanges. The reason is that, due to the multi-threaded nature of the engine, it is possible to have many interchanges being processed at once. While out of the box the engine restricts the number of active threads per CPU, a context switch will cause another thread to be released, potentially kicking off the processing of another large interchange, hence more locks will be taken.


If you’ve not seen it, the performance white paper has a simple equation that can be used to estimate the maximum size of an interchange:



Maximum number of messages per interchange <= 200,000 / (Number of CPUs * BatchSize * MessagingThreadPoolSize)

To make that concrete with illustrative numbers: on a 2-CPU box with a BatchSize of 100 and a MessagingThreadPoolSize of 10, the ceiling would be 200,000 / (2 * 100 * 10) = 100 messages per interchange.


So, around this, there are a few options that you can employ to avoid the out-of-locks scenario for large interchanges (a sketch of the first follows the list):


–         Pre-split the interchange before it hits BizTalk; the size can be determined using the formula above
–         Use 64-bit SQL Server; I believe there are more locks available
–         Write a custom adapter that pre-splits the interchange
–         Write an orchestration to split the interchange
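
As an illustration of the first option, a minimal pre-splitter might look like the following. This is a sketch under my own assumptions: one record per line, a simple line-count chunk size derived from the formula above, and hypothetical paths; a real splitter would need to respect headers/trailers and be wired into whatever drops the files for BizTalk to pick up.

using System;
using System.IO;

// Splits a large flat file into chunks of at most 'chunkSize' lines, so that
// each chunk arrives at BizTalk as a separate, smaller interchange.
public sealed class InterchangeSplitter
{
    private InterchangeSplitter() { }

    public static void Split(string inputFile, string outputDir, int chunkSize)
    {
        int chunk = 0;
        int linesInChunk = 0;
        StreamWriter writer = null;

        using (StreamReader reader = new StreamReader(inputFile))
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                if (writer == null)
                {
                    // Start a new chunk file, e.g. chunk_00000.txt.
                    string path = Path.Combine(outputDir,
                        string.Format("chunk_{0:D5}.txt", chunk++));
                    writer = new StreamWriter(path);
                }

                writer.WriteLine(line);

                if (++linesInChunk == chunkSize)
                {
                    writer.Close();
                    writer = null;
                    linesInChunk = 0;
                }
            }

            if (writer != null)
            {
                writer.Close();
            }
        }
    }
}

Called with, say, InterchangeSplitter.Split(@"C:\drop\big.txt", @"C:\biztalk\in", 100), it would keep each interchange at or under the limit from the worked example above.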


The last point to consider is that smaller interchanges allow for more efficient processing, so the raw throughput will be greatly increased by using smaller interchange sizes; again, there is some good data around this in the performance white paper.