BizTalk Patterns part 1 – Asynchronous Aggregation

Developing a BizTalk Server solution can be challenging, especially for those who are unfamiliar with it. Developing with BizTalk Server, like any software development effort, is like playing chess: just as there are great opening moves in chess, there are great patterns for starting a solution. Besides being an outstanding communication tool, design patterns make the design process faster, which lets solution providers concentrate on the business implementation. More importantly, patterns help formalize the design to make it reusable. Reusability applies not only to the components themselves, but also to the stages the design must go through to morph into the final solution. The ability to apply a patterned, repeatable solution is worth the little time spent learning formal patterns, or even formalizing your own. This entry looks at how architectural design considerations associated with BizTalk Server messaging and orchestration can be applied using patterns, across all industries. The aim is to provide a technical deep-dive, using BizTalk Server-anchored examples, that demonstrates best practices and patterns for parallel processing and correlation.

The blog entry http://blogs.msdn.com/b/quocbui/archive/2009/10/16/biztalk-patterns-the-parallel-shape.aspx examines a classic example of parallel execution. It consists of de-batching a large message file into smaller files, applying child business process executions to each of the smaller files, and then aggregating all the results into a single large result. Each subset (small file) represents an independent unit of work that does not rely on or relate to the other subsets, except that it belongs to the same batch (large message). Eventually each child process (the independent unit of work) returns a result message that is collected with the other results into a collated response message for further processing.

This pattern is a relatively common requirement of clients and systems that interact with BizTalk Server: a batch is sent in, and they require a batch response with the success/failure result for each message. With this pattern one can still leverage the multithreaded nature of BizTalk's batch disassembly while conforming to the client's requirement.

Let's closely examine how this example works. First, it uses several capabilities that BizTalk natively provides – some well documented, some not. A relatively under-utilized feature that BizTalk Server natively provides is the ability to de-batch a message (file splitting) in its Receive pipeline. BizTalk Server can take a large message with any number of child records (which are not dependent on one another, except for the fact that they all belong to the same batch) and split the child records out into independent messages for a child business process to act on. This child business process can be anything – such as a BizTalk Orchestration or a Windows Workflow.

BizTalk Server uses the concept of envelopes, which are BizTalk schemas that define the parent and child data structures. These schemas are then referenced in a custom receive pipeline's disassembler properties. This capability works for both XML and flat-file messages.
Two envelope schemas (a parent and a child) are normally required to instruct BizTalk Server on how to split the file, although using a single schema is also possible.
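
For orientation, a parent envelope schema is an ordinary XSD whose schema-level annotation marks it as an envelope and whose root record carries a Body XPath pointing at the container of the repeating child records. The fragment below is a minimal, hypothetical sketch (the Batch/Record names are invented); in practice you set the Envelope and Body XPath properties in the BizTalk schema editor rather than hand-editing these annotations:

    <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
               xmlns:b="http://schemas.microsoft.com/BizTalk/2003"
               targetNamespace="http://example.org/batch"
               xmlns="http://example.org/batch"
               elementFormDefault="qualified">
      <xs:annotation>
        <xs:appinfo>
          <!-- Marks the schema as an envelope -->
          <b:schemaInfo is_envelope="yes" />
        </xs:appinfo>
      </xs:annotation>
      <xs:element name="Batch">
        <xs:annotation>
          <xs:appinfo>
            <!-- Body XPath: the node whose children are split into separate messages -->
            <b:recordInfo body_xpath="/*[local-name()='Batch']" />
          </xs:appinfo>
        </xs:annotation>
        <xs:complexType>
          <xs:sequence>
            <!-- Each Record conforms to the referenced child (document) schema -->
            <xs:element name="Record" maxOccurs="unbounded" type="xs:anyType" />
          </xs:sequence>
        </xs:complexType>
      </xs:element>
    </xs:schema>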

To learn how to split a large file into smaller files (also known as de-batching), please refer to the BizTalk Server 2006 samples' Split File Pipeline example. Even though the sample is based on BizTalk Server 2006 (R1), it remains relevant for BizTalk Server 2006 R2, BizTalk Server 2009, and BizTalk Server 2010.
Note: There are several other examples of de-batching, such as de-batching from SQL Server, covered on Rahul Garg's blog on de-batching.

After the large file has been split into smaller child record messages, child business processes can simultaneously work on each of these smaller files, which will eventually create child result messages. These child result messages can then be picked up by BizTalk Server and published to the BizTalk MessageBox database.

Once the messages are placed into the MessageBox database, they can be moved into a temporary repository, such as a SQL Server database table.
Messages collected into the database repository can be collated, and even sorted, by a particular identifier, such as the GUID of the parent (the original large batched message).

To aggregate the results back into the same batch, correlation has to be used, and BizTalk Orchestration provides the ability to correlate. Aggregation with Orchestration is relatively easy to develop. Using Orchestration, however, may require more hardware resources (memory, CPU) than necessary: one Orchestration instance is required per batch (so if there are 1,000 large batch messages, there will be 1,000 Orchestration instances), and each Orchestration lingers until its last result message has arrived (or it times out through planned exception handling). There is an alternative way to aggregate without Orchestration, but correlation is still necessary to collate all the results into the same batch. The trick is to correlate the result messages without Orchestration.

Let’s examine how correlation without Orchestration can be achieved.

Paolo Salvatori, in his first blog entry http://blogs.msdn.com/paolos/archive/2008/07/21/msdn-blogs.aspx, describes how to correlate without Orchestration by using a two-way receive port. This is important to note, because the two-way receive port provides key information that can be leveraged by promoting it to the context property bag.

These key properties are:

  • EpmRRCorrelationID
  • ReqRespTransmitPipelineID
  • IsRequestResponse
  • Correlation Token
  • RouteDirectToTP

Correlation without Orchestration is as easy as promoting the right information to the context property bag. Of course, this information needs to be persisted, maintained, and returned with the result messages. However, this method only works with a two-way receive port. What about a one-way receive port, such as a file directory or MSMQ?

That is still possible because the de-batching (file splitting) mechanism of BizTalk Server provides a different set of compound key properties that can be used.

These are:

  • Seq# (sequence number of child message)
  • UID (Identifier of parent message)
  • isLast? (Boolean indicating whether the child message is the last in the sequence)

The sequence number of the child message and the isLast Boolean can be used to determine the total number of records within the large message. For example, if there are 30 messages in a large message, the 30th message will have a sequence number of 30 (sequencing starts at 1) and an isLast value of True.
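
Here is a minimal C# sketch of that completeness check. The BatchTracker class and its member names are hypothetical, purely for illustration; in the solution described below, this bookkeeping actually lives in the SQL staging table and its stored procedure:

    using System.Collections.Generic;

    // Tracks which child results of one batch (keyed elsewhere by the parent UID)
    // have arrived, and learns the batch size from the isLast message.
    public class BatchTracker
    {
        private readonly HashSet<int> received = new HashSet<int>();
        private int? totalRecords; // unknown until the isLast message arrives

        // Record one child result: its sequence number and its isLast flag.
        public void Record(int sequenceNumber, bool isLast)
        {
            received.Add(sequenceNumber);
            if (isLast)
            {
                totalRecords = sequenceNumber; // sequencing starts at 1
            }
        }

        // Complete when the last message has been seen and every
        // sequence number from 1 to the total has arrived.
        public bool IsComplete
        {
            get { return totalRecords.HasValue && received.Count == totalRecords.Value; }
        }
    }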


The final step is to aggregate all the result messages and ensure that they are correctly collated into the same batch that their original source messages came from. The UID used for correlation is the parent identifier under which these result messages can be grouped. An external database table can be used to temporarily store the incoming messages, and a single SQL stored procedure can be used to extrapolate these messages into a single output file. The flow is:

  • A result message is received by a FILE Receive Port
  • This result message is stored in a SQL Server table. This is achieved by being routed to a SQL Send Port where the original message is wrapped into a sort of envelope by a map which uses a custom XSLT. The outbound message is an Updategram which calls a stored procedure to insert the message into the table.
  • A SQL receive port calls a stored procedure to retrieve the records of a certain type from the custom table. This operation generates a single document.
  • This document is routed to a File Send Port where the message is eventually transformed into another format.

This method of using a SQL aggregator is described at http://geekswithblogs.net/asmith/archive/2005/06/06/42281.aspx

Note that the XSLT can eventually be replaced by a custom pipeline component, or directly by an envelope (in which case the send pipeline needs to contain the XML Assembler component).

Sample code is provided courtesy of Ji Young Lee of Microsoft (South) Korea. Uncompress the ZIP file with the directory structure intact into the C:\ root directory (e.g. c:\QuocProj): http://code.msdn.microsoft.com/AsynchAggregation

Acknowledgements to: Thiago Almeida (Datacom), Werner Van Huffel (Microsoft), Ji Young Lee (Microsoft), Paolo Salvatori (Microsoft)


ASP.NET Security Update Now Available

This morning Microsoft released a security update that addresses the ASP.NET Security Vulnerability that I’ve blogged about this past week.  We recommend installing it as soon as possible on your web-servers.

Common Questions/Answers

Below are some answers to a few common questions people have asked:

Do the updates require me to change any code?

No. The update should not require any code or configuration change to your existing ASP.NET applications.

Will I still need to use the workarounds after I install the update?

No. The update removes the need to use the security workarounds we’ve published this past week.  Those were temporary steps that could be taken to protect yourself before the update was released.  After you’ve installed the update you no longer need to use them. 

What is the impact of applying the update to a live web-server?

If you apply the update to a live web-server, there will be some period of time when the web-server will be offline (although an OS reboot should not be required). You’ll want to schedule and coordinate your updates appropriately.

Importantly – if your site or application is running across multiple web-servers in a web-farm, you’ll want to make sure the update is applied to all of the machines (and not just some of them). This is because the update changes the encryption/signing behavior of certain features in ASP.NET, and a mix of patched and un-patched servers will cause that encryption/signing behavior to be incompatible between them.  If you are using a web-farm topology, you might want to look at pulling half of the machines out of rotation, update them, and then swap the active and inactive machines (so that the updated machines are in rotation, and the non-updated ones are pulled from rotation and patched next) to avoid these mismatches.

Does this update work with SharePoint?

Yes.  We have not found any issues in testing SharePoint with this security update.  You should install it on SharePoint servers to ensure that they are not vulnerable.

Can I both install and uninstall the update?

Yes. The updates support install and uninstall scenarios.  Note that if you uninstall the update, though, it will leave your system unprotected.

Downloading the Updates

We are releasing the security update today via the Microsoft Download Center.  We will also release the update via Windows Update and the Windows Server Update Service in a few days as we complete final distribution testing via these channels. Once the update is on Windows Update, you can simply run Windows Update on your computer/server and Windows Update will automatically choose the right update to download/apply based on what you have installed.

If you download the updates directly from the Microsoft Download Center, then you need to manually select and download the appropriate updates.  Below is a table of all of the different update packages available via the Microsoft Download Center today. The downloads are split up by Windows Operating System (and corresponding service pack and processor architecture).  Each operating system version bucket below includes a listing of all available versions of .NET that are supported on it, and includes KB and download links to the appropriate security updates. 

Find your operating system within the below chart, then check to see which versions of .NET you have installed on it (details on how to determine which version of the .NET Framework is installed can be found here).  Download and apply the update packages for each version of .NET that you are using on that server.
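
If you would rather script that check than click through dialogs, one rough (and unofficial) way to list the installed .NET Framework versions is to read the NDP registry key. The sketch below is a simplification – for example, .NET 4 records its servicing level under v4\Full and v4\Client subkeys rather than directly on v4:

    using System;
    using Microsoft.Win32;

    class ListNetVersions
    {
        static void Main()
        {
            // Installed .NET Framework versions are registered under this key.
            using (RegistryKey ndp = Registry.LocalMachine.OpenSubKey(
                @"SOFTWARE\Microsoft\NET Framework Setup\NDP"))
            {
                foreach (string name in ndp.GetSubKeyNames())
                {
                    if (!name.StartsWith("v")) continue; // skip non-version entries
                    using (RegistryKey versionKey = ndp.OpenSubKey(name))
                    {
                        object sp = versionKey.GetValue("SP"); // service pack level, when present
                        Console.WriteLine(sp == null ? name : name + " SP" + sp);
                    }
                }
            }
        }
    }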

Windows Server 2008 R2 and Windows 7

 

.NET Framework Version | KB Article | Patch
.NET Framework 3.5.1 (Default install) | KB2416471 | Download
.NET Framework 4 | KB2416472 | Download

 

Windows Server 2008 SP2, Windows Vista SP2

 

.NET Framework Version | KB Article | Patch
.NET Framework 2.0 SP2 (default install) | KB2416470 | Download
.NET Framework 4 | KB2416472 | Download
.NET Framework 3.5 SP1 | KB2416470, KB2416473 | Download, Download*
.NET Framework 3.5 | KB2416470, KB2418240 | Download, Download*
.NET Framework 1.1 SP1 | KB2416447 | Download

*When multiple patch downloads are listed above against a .NET version (for example with .NET 3.5 SP1 and .NET 3.5 installs) then all patches should be installed (order is not relevant).

Windows Server 2008, Windows Vista SP1

 

.NET Framework Version | KB Article | Patch
.NET Framework 2.0 SP1 (default install) | KB2416469 | Download
.NET Framework 4 | KB2416472 | Download
.NET Framework 3.5 SP1 | KB2416474, KB2416473 | Download, Download*
.NET Framework 2.0 SP2 | KB2416474 | Download
.NET Framework 3.5 | KB2416469, KB2418240 | Download, Download*
.NET Framework 1.1 SP1 | KB2416447 | Download

*When multiple patch downloads are listed above against a .NET version (for example with .NET 3.5 SP1 and .NET 3.5 installs) then all patches should be installed (order is not relevant).

Windows Server 2003 SP2 32-bit

 

.NET Framework Version | KB Article | Patch
.NET Framework 1.1 SP1 (default install) | KB2416451 | Download
.NET Framework 4 | KB2416472 | Download
.NET Framework 3.5 SP1 | KB2418241, KB2416473 | Download, Download*
.NET Framework 2.0 SP2 | KB2418241 | Download
.NET Framework 3.5 | KB2416468, KB2418240 | Download, Download*

*When multiple patch downloads are listed above against a .NET version (for example with .NET 3.5 SP1 and .NET 3.5 installs) then all patches should be installed (order is not relevant).

Windows Server 2003 64-bit

 

.NET Framework Version/SP | KB Article | Patch
Default OS Configuration | NA | NA
.NET Framework 4 | KB2416472 | Download
.NET Framework 3.5 SP1 | KB2418241, KB2416473 | Download, Download*
.NET Framework 2.0 SP2 | KB2418241 | Download
.NET Framework 3.5 | KB2416468, KB2418240 | Download, Download*
.NET Framework 1.1 SP1 | KB2416447 | Download

*When multiple patch downloads are listed above against a .NET version (for example with .NET 3.5 SP1 and .NET 3.5 installs) then all patches should be installed (order is not relevant).

Windows XP SP3 32-bit and 64-bit

 

.NET Framework Version/SP | KB Article | Patch
Default OS Configuration | NA | NA
.NET Framework 4 | KB2416472 | Download
.NET Framework 3.5 SP1 | KB2418241, KB2416473 | Download, Download*
.NET Framework 2.0 SP2 | KB2418241 | Download
.NET Framework 3.5 | KB2416468, KB2418240 | Download, Download*
.NET Framework 1.1 SP1 | KB2416447 | Download

*When multiple patch downloads are listed above against a .NET version (for example with .NET 3.5 SP1 and .NET 3.5 installs) then all patches should be installed (order is not relevant).

Summary

We recommend immediately applying the security update to your servers in order to protect your applications against attackers trying to exploit them.  We’d like to thank Juliano Rizzo and Thai Duong, who discovered that their previous research worked against ASP.NET, for not releasing their POET tool publicly before our update was ready.

You can ask questions and get help with the security vulnerability and update in a special ASP.NET Forum that we have setup here.  If you have problems or questions you can also contact Microsoft Customer Support for help (including support over the phone with a support engineer).  The official Microsoft Security Bulletin post is here.

Thanks,

Scott

Free e-book offer from Microsoft Press – Introducing Windows Server 2008 R2

Microsoft is offering a free e-book to partners. Learn about the features of Windows Server 2008 R2 in the areas of virtualization, management, the web application platform, scalability and reliability, and interoperability with Windows 7. Sign in to download Introducing Windows Server 2008 R2, written by industry experts Charlie Russel and Craig Zacker along with the Windows Server team at Microsoft.

Sign in to Microsoft Partner site to download the free e-book.

LOTS and LOTS of BizTalk Server 2010 Resources

This post is a follow-up to the RTM last week of BizTalk Server 2010. The BizTalk User Experience team has been REALLY busy putting together a ton of great resources. These are all live now, enjoy!

 

Learning:

 

Documentation:

 

Posters:

Versioning NET4 Workflow Services in Windows Server AppFabric

There is tremendous interest in versioning .NET4 Workflows.  This blog post provides guidance on common scenarios that can be supported.

 

Background

Workflow versioning is required when a change in a business process causes a modification in the implementation of a .NET4 Workflow Service, and this change may need to be shielded from, or is simply not evident to, the client. In other words, existing clients (already deployed) can continue to communicate with the Service without any changes. It is entirely up to the Server to decide whether instances of old and new implementations should exist side by side, whether existing instances should be upgraded to the new Definition, and how to route client requests to instances of different implementations.

 

When a new Workflow Definition is deployed, there is a choice about what to do with existing Instances of the old Definition(s): the first option is to terminate the existing Workflow Instances, and the second is to phase the Instances out.

 

Terminating existing Workflow Instances is rather extreme, and is only viable if the Workflow Definition contains a catastrophic flaw or the business process is so outdated that existing Instances cannot be allowed to continue. Phasing out Instances after they run their due course is the more prevalent need: existing Workflow Instances remain unaffected and continue with their original Definitions. Over time all of these Instances complete and the old Definition can be discarded, while new Instances are always started with the new Definition.

 

Before we proceed further, let's recap the requirement that each Workflow Definition is hosted by a single Workflow Service Host (WSH). In our case, since Instances of multiple Definitions must run side by side, a corresponding number of WSHs must be created. These WSHs may share a single process or be divided between any number of processes, depending on the load patterns.
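
As a rough illustration of "one WSH per Definition, optionally sharing a process" – the file names and addresses below are invented, and a real deployment would typically let IIS/AppFabric do the hosting rather than self-host like this:

    using System;
    using System.ServiceModel.Activities;
    using System.Xaml;

    class SideBySideHosting
    {
        static void Main()
        {
            // Load each version of the workflow service definition (hypothetical paths).
            var v1 = (WorkflowService)XamlServices.Load(@"C:\Services\OrderService_v1.xamlx");
            var v2 = (WorkflowService)XamlServices.Load(@"C:\Services\OrderService_v2.xamlx");

            // One WorkflowServiceHost per Definition, both sharing this process,
            // each listening at its own version-specific base address.
            var hostV1 = new WorkflowServiceHost(v1, new Uri("http://localhost:8080/OrderService/v1"));
            var hostV2 = new WorkflowServiceHost(v2, new Uri("http://localhost:8080/OrderService/v2"));

            hostV1.Open();
            hostV2.Open();

            Console.WriteLine("Both versions are listening. Press Enter to stop.");
            Console.ReadLine();

            hostV2.Close();
            hostV1.Close();
        }
    }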

 

This blog will focus on the 'Phasing out Instances' approach. This approach requires a logical layer between the Client and the Server (with the WSH) that interrogates the Client message and routes it to the appropriate Workflow Service Host hosting the matching Workflow Definition.

 

The new WCF Routing Service is used in this scenario. The Routing Service is placed in front of all (WSH) Hosts and routes incoming messages to the corresponding Hosts.  All Clients send messages to the endpoint exposed by the Routing Service.

Requirement

The requirements for this scenario are summarized as:

  • New requests should always launch an Instance of the latest version of the Workflow Definition.

  • Clients of in-flight processes should be able to reach the right version of the Workflow Definition.

Assumption

Clients are agnostic of Workflow versioning; once deployed, they are not (or cannot be) modified when a new Workflow version is deployed to the server.

'Layback Routing' – Design Approach

The design approach, named in this blog 'Layback Routing', is based on customized routing logic that processes correlation errors. When correlation fails on the latest version, the logic recursively retries down a list of older versions until correlation succeeds or all versions are exhausted, at which point the correlation error is returned to the client.

Caveat

The solution and design approach is suitable for long-running Workflows and should be used in low/medium invocation scenarios – bottom line, invocation performance should not be a constraint. For very high volume processing, other designs should be pursued.

Solution

 

The Routing solution elaborated here is a simple customization of the WCF Routing Service.

 

The WCF Routing Service (System.ServiceModel.Routing) is a general-purpose, programmable router that is a handy intermediary in a number of different situations, especially around managing communications between Clients and Services. The Routing Service is programmable via an API or through web.config. One very handy feature of the Routing Service, which Layback Routing utilizes, is the backup list: a list of service endpoints that act as a backup for when a Service's primary endpoint cannot be found or is unavailable.
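
For orientation, a backup list in the Routing Service's web.config looks roughly like the sketch below. The endpoint and list names are invented, and the client endpoints themselves would be declared elsewhere in the configuration:

    <routing>
      <filters>
        <!-- Route everything to the latest version by default -->
        <filter name="matchAll" filterType="MatchAll" />
      </filters>
      <filterTables>
        <filterTable name="routingTable">
          <add filterName="matchAll" endpointName="OrderService_v2" backupList="olderVersions" />
        </filterTable>
      </filterTables>
      <backupLists>
        <!-- Tried in order when the primary endpoint fails -->
        <backupList name="olderVersions">
          <add endpointName="OrderService_v1" />
        </backupList>
      </backupLists>
    </routing>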

 

A message that arrives from a Client causes a persisted Instance to resume. This Instance expects to communicate with version 1 of the Service file, but when it tries to do so it reaches the newer version 2, causing a correlation fault. While the Routing Service is not built to catch correlation faults, the custom endpoint behavior added to the Routing Service will catch this fault and throw an EndpointNotFoundException. That exception type is caught by the WCF Routing Service, which will then attempt to match the Instance with each of the subsequent endpoints listed in this Service file's backup list. All of the older versions of the Service file are contained in this list, so one of them should match the Instance, and execution can then resume.

Custom Behavior applied to Web.Config

The custom endpoint behavior mentioned here is a ClientMessageInspector. Implementing this custom behavior requires three files:

1. BehaviorExtensionElement – Sets up the custom behavior; this is the class referenced from the web.config of the Routing Service.

2. IEndpointBehavior – Manages the message channel binding parameters.

3. IClientMessageInspector – Acts on the message.

 

The code below shows how to add the reference to this custom behavior in the Routing Service's web.config file:

    <system.serviceModel>
      ...
      <behaviors>
        ...
        <endpointBehaviors>
          <behavior>
            <persistenceFaultToClientException />
          </behavior>
        </endpointBehaviors>
      </behaviors>

      <extensions>
        <behaviorExtensions>
          <add name="persistenceFaultToClientException"
               type="ServiceExtensions.PersistenceFaultToClientExceptionBehaviorElement, ServiceExtensions, Version=1.0.0.0, Culture=neutral, PublicKeyToken=null" />
        </behaviorExtensions>
      </extensions>
      ...

 

The ClientMessageInspector file implements IClientMessageInspector. The AfterReceiveReply function inspects the message after it has been sent to its intended destination. If a fault has occurred, and the fault contains wording indicating that it is a correlation fault, we assume that this is due to a mismatched version and throw an EndpointNotFoundException. The Routing Service will then try this instance against the endpoints in the backup list and hopefully find a matching version. If the correlation error is in fact due to a more serious condition, the Routing Service will still attempt to connect the instance with the endpoints in the backup list, but they will all fail and the client will see the error message.

    using System.ServiceModel;
    using System.ServiceModel.Dispatcher;

    namespace ServiceExtensions
    {
        public class PersistenceFaultToClientExceptionInspector : IClientMessageInspector
        {
            public void AfterReceiveReply(ref System.ServiceModel.Channels.Message reply, object correlationState)
            {
                if (reply.IsFault)
                {
                    string faultText = reply.ToString();

                    if (faultText.Contains("contained incorrect correlation data"))
                    {
                        throw new EndpointNotFoundException("Correlation failed, try another service version.");
                    }
                }
            }

            public object BeforeSendRequest(ref System.ServiceModel.Channels.Message request, System.ServiceModel.IClientChannel channel)
            {
                return null;
            }
        }
    }

AppFabric Dashboard Behavior

If we were to use the behavior as is, the AppFabric Dashboard would show an error indicating that the persistenceFaultToClientException element is not recognized. To avoid this error, we need to place a schema file in %windir%\system32\inetsrv\config\schema.

Schema file Service_Extensions_Schema.xml:

    <configSchema>
      <sectionSchema name="system.serviceModel/behaviors">
        <element name="endpointBehaviors">
          <collection addElement="behavior" removeElement="remove" clearElement="clear" allowDuplicates="true">
            <element name="persistenceFaultToClientException"/>
          </collection>
        </element>
      </sectionSchema>
    </configSchema>

With the ClientMessageInspector in place, we can rely on the Routing Service to route messages to older versions of the service file when needed. Because this versioning and routing work takes place on the server after each publish, we use a custom provider on the server's instance of Web Deploy to accomplish this task.

 

So, with the project published to the Server and the enabled protocols set, the Client application calls the custom provider to perform the Service file versioning and the routing configuration updates. The source and destination base options both point to the IIS Server. The provider only needs to know the website and application name for these updates; this is set in the destination's DeploymentProviderOptions.Path variable. The custom provider we use is named versionedPublish.

    // destinationBaseOptions and syncOptions are assumed to be DeploymentBaseOptions/
    // DeploymentSyncOptions configured elsewhere; both base options point at the IIS Server.
    DeploymentProviderOptions sourceProviderOptions = new DeploymentProviderOptions("versionedPublish");
    sourceProviderOptions.Path = "";

    DeploymentObject deploymentObject = DeploymentManager.CreateObject(
        sourceProviderOptions,
        destinationBaseOptions);

    DeploymentProviderOptions destinationOptions = new DeploymentProviderOptions("versionedPublish");
    destinationOptions.Path = "Default Web Site/WorkflowApplication";

    deploymentObject.SyncTo(destinationOptions, destinationBaseOptions, syncOptions);

Creating a Web Deploy Custom Provider

This Custom Provider's assembly (dll) resides in the Extensibilities folder of Web Deploy on the IIS Server. The client must also have the Custom Provider dll in its own Web Deploy Extensibilities folder; otherwise Web Deploy won't recognize the Provider being called and will return an error instead.

 

Note that you need to create the Extensibilities folder as it is not created upon installation of Web Deploy (location: %program files%\IIS\Microsoft Web Deploy\Extensibilities).

 

To create a Custom Provider, we use a class library project in Visual Studio 2010 and build it with a target framework of .NET 3.5. Web Deploy won’t yet recognize a provider written in .NET 4.0. Also, the project must reference Microsoft.Web.Deployment.dll and Microsoft.Web.Delegation.dll, both found in the Web Deploy folder. A Custom Provider consists of two files, a DeploymentObjectProvider and a DeploymentProviderFactory.

 

The DeploymentProviderFactory provides an interface for the Web Deployment Agent to interact with the Custom Provider. Note that both the Name and FriendlyName functions return the name of the Custom Provider.

File VersionedPublishProviderFactory.cs:

    ...
    using Microsoft.Web.Deployment;

    namespace Providers.WebDeployUtilities
    {
        [DeploymentProviderFactory]
        public class VersionedPublishProviderFactory : DeploymentProviderFactory
        {
            protected override DeploymentObjectProvider Create(DeploymentProviderContext providerContext, DeploymentBaseContext baseContext)
            {
                return new VersionedPublishProvider(providerContext, baseContext);
            }

            public override string Description
            {
                get { return @"Custom provider for versioning published files."; }
            }

            public override string ExamplePath
            {
                get { return @"Destination Web Site/ApplicationName"; }
            }

            public override string FriendlyName
            {
                get { return "versionedPublish"; }
            }

            public override string Name
            {
                get { return "versionedPublish"; }
            }
        }
    }

Next up is the Custom Provider itself. Just after the class declaration are two lines that define the name of the custom provider as well as the name of the provider's key attribute. The GetAttributes function is used to determine whether the provider is looking at the source or the destination. When the provider is at the destination, a DeploymentException is thrown, which ends up calling the Add function.

File VersionedPublishProvider.cs:

    using Microsoft.Web.Deployment;
    using System.IO;
    using System.Diagnostics;
    using System.Text.RegularExpressions;
    using System.Xml.Linq;
    using Microsoft.Web.Administration;
    using System.Web.Routing;
    using System.Net;
    using System.Web;

    namespace Providers.WebDeployUtilities
    {
        public class VersionedPublishProvider : DeploymentObjectProvider
        {
            // The provider's name and the name of its key attribute.
            internal const string ObjectName = "versionedPublish";
            internal const string KeyAttributeName = "path";

            public VersionedPublishProvider(DeploymentProviderContext providerContext, DeploymentBaseContext baseContext)
                : base(providerContext, baseContext)
            {
                this.FilePath = providerContext.Path;
            }

            protected internal string FilePath
            { get; set; }

            #region DeploymentObjectProvider members
            public override void GetAttributes(DeploymentAddAttributeContext addContext)
            {
                if (this.BaseContext.IsDestinationObject)
                {
                    // At the destination: throwing here ends up invoking Add below.
                    throw new DeploymentException();
                }

                base.GetAttributes(addContext);
            }

            public override DeploymentObjectAttributeData CreateKeyAttributeData()
            {
                DeploymentObjectAttributeData attributeData = new DeploymentObjectAttributeData(
                    VersionedPublishProvider.KeyAttributeName,
                    this.FilePath,
                    DeploymentObjectAttributeKind.CaseInsensitiveCompare);

                return attributeData;
            }

The Add function (below) is called when it is time to act on the destination. The purpose of our Custom Provider is to create new versions of the service files and modify the Routing configuration as described above. The following code is a summary of the functions involved; a complete version of this code is available for download along with this blog posting.

 

The first step in the process is to take the data passed in via the destination DeploymentProviderOptions.Path when this Custom Provider is called. From inside the Provider, this data is in the variable this.FilePath. We separate this string into website and application and go to work. For your reference, the source's DeploymentProviderOptions.Path value would be accessed with source.ProviderContext.Path.

 

Next, function findAppLocation uses ServerManager (Microsoft.Web.Administration) to get the physical location of the folder for this IIS Application. With this location handy, findFilesNoVersions discovers all of the service files in this application. For each service file, a versioned copy is created and ServerManager is used to update Layback Routing’s configuration file.

    public override void Add(DeploymentObject source, bool whatIf)
    {
        if (!whatIf)
        {
            string siteNameAppName = this.FilePath;
            string[] siteAndAppArray = siteNameAppName.Split(new string[] { "/" }, StringSplitOptions.RemoveEmptyEntries);
            string siteName = siteAndAppArray[0];
            string appName = siteAndAppArray[1];

            string correctAppPath = findAppLocation(siteName, appName);
            List<string> fileNamesNoVersions = new List<string>();
            string versionKey = "_wfs";

            findFilesNoVersions(correctAppPath, fileNamesNoVersions);

            foreach (string fileNameNoVersion in fileNamesNoVersions)
            {
                string justName;
                int highVer;
                determineNextVersion(correctAppPath, fileNameNoVersion, out justName, out highVer);
                createVersionedCopy(correctAppPath, versionKey, justName, highVer);

                string routerName = "LaybackRouting";
                using (ServerManager serverManager2 = new ServerManager())
                {
                    string appNameJustName = appName + justName;
                    appNameJustName = appNameJustName.Replace(' ', '_');
                    string appNameJustNameAddress = HttpUtility.UrlPathEncode(appName) + "/" + HttpUtility.UrlPathEncode(justName);
                    routerName = HttpUtility.UrlEncode(routerName);
                    Configuration routerConfig = serverManager2.GetWebConfiguration(siteName, "/" + routerName);
                    ConfigurationElementCollection clientCollection = addClientEndpointWithoutVersion(appNameJustName, appNameJustNameAddress, routerConfig);
                    addClientEndpointWithVersion(versionKey, highVer, appNameJustName, appNameJustNameAddress, clientCollection);
                    ConfigurationSection routingSection;
                    ConfigurationElementCollection backupListsCollection;
                    ConfigurationElement listToCreate;
                    findBackupList(appNameJustName, routerConfig, out routingSection, out backupListsCollection, out listToCreate);
                    if (listToCreate != null)
                    {
                        addBackupListEndpoint(versionKey, highVer, appNameJustName, listToCreate);
                        findAndEditLaybackTable(versionKey, highVer, appNameJustName, routingSection);
                    }
                    else
                    {
                        makeBackupListAddEndpoint(versionKey, highVer, appNameJustName, backupListsCollection);
                        findAndEditLaybackTable(versionKey, highVer, appNameJustName, routingSection);
                        addEndpointAddressFilter(routerName, appNameJustName, appNameJustNameAddress, routingSection);
                    }
                    serverManager2.CommitChanges();
                }
            }
        }
    }

Limitations

There is a performance overhead incurred in the Layback Routing process of attempting to successfully connect a message with the appropriate Definition version. Each request for an older version takes Processing/CPU resources to check the Persistence Store for matching correlation keys, and to return an error to the Routing Service (it’s local, so no additional network bandwidth is consumed).

 

It is possible to optimize this scenario further. The WCF Routing Service’s use of the backup list happens on every request for an old version; caching the resulting mapping between correlation keys and endpoints, possibly using AppFabric Caching, would help optimize performance and resource utilization. 
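
As a sketch of that caching idea – the cache name, the key shape, and the place where you would hook this into the routing path are all assumptions, not part of the shipped Routing Service:

    using Microsoft.ApplicationServer.Caching;

    public class EndpointVersionCache
    {
        private readonly DataCache cache;

        public EndpointVersionCache()
        {
            // "default" is an assumed AppFabric cache name; use whatever is configured.
            var factory = new DataCacheFactory();
            cache = factory.GetCache("default");
        }

        // Remember which versioned endpoint a correlation key resolved to...
        public void Remember(string correlationKey, string endpointName)
        {
            cache.Put(correlationKey, endpointName);
        }

        // ...so the next message for the same instance can skip the retry walk.
        public string Lookup(string correlationKey)
        {
            return (string)cache.Get(correlationKey);
        }
    }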

Namaste!

Windows Server AppFabric Monitoring – How to create operational analytics reports with AppFabric Monitoring and Excel PowerPivot

One of the great (and major) features of Windows Server AppFabric is the out-of-the-box experience for monitoring WCF- and WF-based services. Gone are the days when an exception halfway through a workflow instance could either 1) never be discovered or 2) take hours of creative effort and ad-hoc debugging (not to mention the hair pulling that goes along with all this) to find out exactly what the issue was and why it happened. The AppFabric Dashboard does a pretty good job of exposing key execution statistics to the Administrator. But is it enough?

Since AppFabric went public with a Beta release back in early March, I've been hearing from customers that although the AppFabric Dashboard is great for looking at the current state of the farm, it lacks some of the analytical, trend-focused aspects of monitoring a real-life environment. So, this article will explain what is involved in creating interactive analytical reports (or dashboards, if you will) based on the monitoring data captured by AppFabric, and our old friend – Microsoft Excel…well, with its new addition, PowerPivot.

Now, is this going to give you the exact implementation? Definitely not; the article presents the solution design and shows a sample report to demonstrate the capabilities. In your implementation you may choose to report on different metrics, filter or aggregate the data differently, provide different pivots, or completely modify the layout to meet your specific requirements.

With that said, the key features of the solution presented in this article include:

  • Visual presentation of the historical data stored in the AppFabric monitoring store
  • Trend visualization and analysis of the WCF services load and call latency (average duration)
  • Trend visualization and analysis of the WF services load, long-running state, and efficiency (end-to-end service lifetime)
  • Error filtering and error trend analysis
  • Interactive slicing and dicing per service, WCF service call result, and WF state
  • Time-based analysis – granular down to the minute, or as coarse as yearly.

Surely the above is a compelling value prop to any IT pro!

Reporting Requirements

For the purpose of this article, let’s define some basic operational requirements that demonstrate the solution approach – the report should provide analytical capabilities on top of the service tracking data captured by AppFabric, as follows:

  • Provide interactive slicing and dicing of call statistics (number of calls and response time) over time, and by service and operation
  • Report on success/failure statistics

Solution Design

At a high level the design of the presented solution is simple and is depicted in the following diagram:

In summary, we create a star schema-like PowerPivot model based on the data exposed by the public AppFabric monitoring views, and then create a spreadsheet with a number of PowerPivot-based charts that visualize the AppFabric monitoring statistics. The PowerPivot analytical features allow the user to interactively and very flexibly work with the large amount of data available in the AppFabric monitoring store. Finally, the Excel workbook can be published to a PowerPivot-enabled SharePoint installation, for sharing with team members and business stakeholders.

The Implementation

A few words on the AppFabric monitoring subsystem

Before we jump into connecting PowerPivot and Excel to the AppFabric monitoring store, let's first cover some of the basics that will hopefully help us build a robust and supported solution. 🙂

The AppFabric Monitoring API

In the AppFabric world, the monitoring API is database views – instead of exposing a programming API, AppFabric takes a lightweight approach of providing a number of public database views sitting on top of the physical tables. These views provide a layer of abstraction over the physical data model, and are designed for direct consumption by the AppFabric Dashboard itself as well as by any other application that needs to query the monitoring data. Here is a list of the key public views that we will use in this article:

  • ASEventSources – This view stores metadata for the events, identifying the source of the events, such as service name, site, and virtual path
  • ASWcfEvents – This view is created over the analytic tracing events that are emitted when a WCF service is invoked
  • ASWfEvents – This view is created over the tracking events emitted for WF instances
  • ASWfInstances – This view is created over the active WF instances. There is one row per WF instance

The following entity model depicts the logical relationship between these DB views:

So, the interesting points here are:

  • All records returned by the views refer to the source (service) that generated the event using the EventSourceId field
  • The ASWfInstances view has a logical master-child relationship with the ASWfEvents view. So, we expect that a record from the ASWfInstances view would have a number of records in the ASWfEvents view, each representing an event generated from the execution of an activity inside the corresponding workflow instance.

WCF Event Types

The WCF runtime emits ETW tracking events for different types of execution events, such as a call to a WCF service, a call failure, WCF throttling activation, etc. These event types have IDs associated with them. A full reference of the different ETW event types applicable to WCF is available here. Later in the article, we will use the EventTypeId column from the ASWcfEvents view to filter out the events that don’t make sense (or are too detailed) for our solution.

AppFabric WCF tracked events aggregation

In general, WCF calls are stand-alone units of work – the client calls a service operation, the operation performs some work (in code) and then completes. A single tracking event is sufficient to fully describe the outcome from the operation – success or error (along with other relevant information, such as exception details for example). This relative simplicity of tracking WCF service calls offered an opportunity to optimize the WCF monitoring logic in AppFabric by introducing an event aggregation feature. Based on a pre-defined sampling interval with a default value of 5 seconds, data from all WCF tracked events during the sample interval gets aggregated into a single tracking record by operation. For example, if in the last 5 seconds the GetCreditScore() operation was called 30 times, and GetUserDetails() was called 20 times, we will only see 2 tracked events in the ASWcfEvents view – one for GetCreditScore() and one for GetUserDetails(). The information in these two records will have aggregated statistics for the AggregateCount, AverageDuration, and MaxDuration columns, calculated from the multiple executions of the two operations over the sample interval. Of course, the AggregateCount value will have 30 and 20 respectively, while the other two metrics will be based on the performance of the service operations.

Why is this important? When we start looking at aggregating the monitoring data at a higher level (in our T-SQL queries and then in Excel), accurately calculating the double-aggregated data may not be trivial. So, before we start crafting our analytical report, we may want to first disable the native AppFabric aggregation of WCF events using the EventCollection service configuration. The steps for this task are:

  1. Open the root web.config file located in C:\Windows\Microsoft.NET\Framework\v4.0.30319\Config for the x86 platform, or C:\Windows\Microsoft.NET\Framework64\v4.0.30319\Config for x64
  2. Search for the following section:

    <collectors>
        <collector name="" session="0">
            <settings retryCount="5" eventBufferSize="10000" retryWait="00:00:15"
                      samplingInterval="00:00:05" aggregationEnabled="true" />
        </collector>
    </collectors>

  3. Update the aggregationEnabled attribute to false
  4. Save the file and restart the EventCollection service using the Services console

By now I'm sure you're wondering what the performance impact of having aggregation turned off is. It is not that significant on the runtime – in the range of a 2-3% loss in throughput. However, if you are processing more than 3,000-4,000 WCF calls per second, the per-call tracking approach (with aggregation off) may lead to bottlenecks in the monitoring data staging job earlier than otherwise. Luckily, the techniques to mitigate these bottlenecks are also described in detail in my previous blog article.

Building the Report – Step by Step

The first step towards our goal is to write the T-SQL statements that will get us the right data using the AppFabric Monitoring views. I am using four queries that we will later copy/paste into PowerPivot. Let’s however spend a few moments on the T-SQL itself.

The “Dates” Query

The first query will give us all unique date/time values that have associated WCF and/or WF events. These values will be used to build our “time dimension” in PowerPivot. Note that in order to limit the number of distinct dates and times, we will trim the values to the minute. So, for a 24 hour period we will have a maximum of 1,440 values, if there was constant activity in the environment. Here is the query:

SELECT
    DISTINCT
    LEFT(CONVERT(varchar, DATEADD(minute, DATEDIFF(minute, GETUTCDATE(), GETDATE()), E.TimeCreated), 20), 16) as Date
FROM
    ASWcfEvents E
WHERE
    EventTypeId IN (214, 222, 223)
UNION
SELECT
    LEFT(CONVERT(varchar, DATEADD(minute, DATEDIFF(minute, GETUTCDATE(), GETDATE()), E.TimeCreated), 20), 16) as Date
FROM
    ASWfEvents E
    JOIN
        (SELECT MAX(WfE.Id) as Id
         FROM ASWfEvents WfE
         WHERE ActivityName IS NOT NULL
         GROUP BY WfE.WorkflowInstanceId) as T
        ON (T.Id = E.Id)

 

Interesting points about the query:

  • It selects DISTINCT values
  • For WCF events, for the purpose of this article we are only interested in events 214 (Operation Completed), 222 (Operation Failed) and 223 (Operation Faulted). This gives us the dates for all successful service call completions and errors
  • We use a UNION to then merge (still under the DISTINCT clause) WF events generated by the latest activity for each workflow instance – this is achieved by selecting the MAX Id record for each workflow instance, in the JOIN part of the query
  • Since AppFabric stores the data in UTC format, we need to calculate the local data and time, for display purposes in Excel/PowerPivot. I used this formula to get the right result:

    DATEADD(minute,DATEDIFF(minute,GETUTCDATE(),GETDATE()),E.TimeCreated)

  • The date values are trimmed to the minute, so we will have the year, month, day, hour and minute data.

The “Event Sources” Query

The next query will be used to get a list of all data sources (services) that generated event records into the AppFabric monitoring store. Here is the query:

SELECT
    ES.Id,
    ES.Name
FROM
    ASEventSources ES
WHERE
    ES.Name IS NOT NULL

Very simple, so I'll just skip to the next one 🙂

The “WCF Events” Query

This query takes care of returning WCF service call data to PowerPivot. The T-SQL for this task is as follows:

SELECT
    E.EventSourceId,
    E.OperationName,
    CASE
        WHEN E.EventTypeId = 214 THEN 'Success'
        WHEN E.EventTypeId IN (222, 223) THEN 'Error'
    END as EventType,
    LEFT(CONVERT(varchar, DATEADD(minute, DATEDIFF(minute, GETUTCDATE(), GETDATE()), E.TimeCreated), 20), 16) as CallDate,
    E.Duration / 1000.0 as Duration,
    1 as Count
FROM
    ASWcfEvents E
WHERE EventTypeId IN (214, 222, 223)
ORDER BY CallDate ASC

Again, some interesting facts about this query:

  • We are only querying Operation Completed/Failed/Faulted events – 214, 222 and 223 respectively. Event 214 gets mapped to “Success”, while event 222 and 223 both get mapped to an “Error” string
  • Date is trimmed to the minute, so that it can be later joined (in PowerPivot) to the Date column from the first query

The “WF Events” Query

The last of the four queries will be providing us with WF events data, based on the ASWfEvents and ASWfInstances views:

SELECT
    WfI.LastEventSourceId as EventSourceId,
    LEFT(CONVERT(varchar, DATEADD(minute, DATEDIFF(minute, GETUTCDATE(), GETDATE()), WfI.StartTime), 20), 16) as InstanceStartTime,
    CASE
        WHEN WfI.LastEventStatus IN ('Completed', 'Terminated', 'Canceled', 'Aborted') THEN WfI.CurrentDuration
        ELSE DATEDIFF(second, DATEADD(minute, DATEDIFF(minute, GETUTCDATE(), GETDATE()), WfI.StartTime), GETDATE())
    END as CurrentDuration,
    WfI.LastEventStatus,
    WfE1.ActivityName,
    COALESCE(WfE1.State, WfI.LastEventStatus) as State,
    LEFT(CONVERT(varchar, DATEADD(minute, DATEDIFF(minute, GETUTCDATE(), GETDATE()), WfE1.TimeCreated), 20), 16) as LastActivityTime
FROM
    ASWfEvents WfE1
    JOIN
        (SELECT MAX(WfE.Id) as Id
         FROM ASWfEvents WfE
         WHERE ActivityName IS NOT NULL
         GROUP BY WfE.WorkflowInstanceId) as T
        ON (T.Id = WfE1.Id)
    JOIN ASWfInstances WfI
        ON WfI.WorkflowInstanceId = WfE1.WorkflowInstanceId

This is the most complex of the four queries 🙂 The only notable point here is that we are using the same technique as in the first query – a sub-query gets the event record for the latest activity of each workflow instance, then joins back to the same source view (ASWfEvents), and also to the ASWfInstances view in order to get the current state of the instance.

Setting up the PowerPivot tables

PowerPivot shipped with the release of Excel 2010, so for the next steps you must have Office 2010 (ha, who doesn't anyway… 🙂). If your machine meets this requirement, when you start Excel you will notice a new ribbon called PowerPivot:

Click on the PowerPivot Window button – the first button on the left hand side. This will open the PowerPivot for Excel window. We will need to create a new data source, pointing to the AppFabric monitoring store. From the ribbon, select the “From Database -> From SQL Server” button.

 

In the Table Import wizard window that will open, provide the details to connect to the AppFabric monitoring store, and then click Next. Select the option to write a query:

For the friendly name of the query, type in “Dates”, and then copy/paste the “Dates” query from earlier in the blog and click the Validate button. The window should look like this:

Click Finish. For increased reporting flexibility, I like to break date/time fields down into individual components – year, month, day, hour, and minute. To do this, add a new calculated column to the Dates table, with the formula set to "=YEAR([Date])". Repeat the same for the remaining four columns (month, day, hour, and minute), this time using the MONTH(), DAY(), HOUR(), and MINUTE() functions respectively. The window should look like this:
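
For reference, the five calculated-column formulas end up as follows (the column names are my own choice):

    Year   = YEAR([Date])
    Month  = MONTH([Date])
    Day    = DAY([Date])
    Hour   = HOUR([Date])
    Minute = MINUTE([Date])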

We now have one of the PowerPivot tables that we need – the “Dates” table.

Next, in a very similar way, we will have to create PowerPivot tables for the remaining three queries. I won’t cover this in detail as the process is almost identical to what we just did. I will give you a hint though – for all subsequent tables, use the “Existing Connections” button from the “Design” ribbon, and reuse the data source that we defined a bit earlier in the article:

Also, when you create the other PowerPivot tables, you can use the same naming as in my sample:

I am pretty sure that you will figure out which table corresponds to which query 🙂 We are almost done with the creation of the PowerPivot tables. The last thing we need to do is create the relationships between the tables – think star schema. You can use the "Create Relationship" button from the "Design" ribbon. The details for the relationships are as follows:

  • WcfEvents table: Relationship to Dates (as a lookup table), based on the CallDate and Date columns respectively
  • WcfEvents table: Relationship to EventSources (as a lookup table), based on the EventSourceId and Id columns respectively
  • WfEvents table: Relationship to Dates (as a lookup table), based on the LastActivityTime and Date columns respectively
  • WfEvents table: Relationship to EventSources (as a lookup table), based on the EventSourceId and Id columns respectively

After you setup the relationships, if you click on the “Manage Relationships” button from the ribbon, the “Manage Relationships” window should look like this:

BTW, I think I got carried away with all these instructions and forgot to say that by now you should have saved your spreadsheet a few times already! But you already knew and did that, right? 🙂

The Pivot Charts report

I won't even dare give step-by-step instructions for configuring pivot tables and pivot charts – if I went into this, the length of the article would become unbearable. Besides, this topic is covered in the Excel documentation. Instead, I'll give a few hints and pointers specific to PowerPivot-based charts and tables:

To insert a new pivot chart or table based on PowerPivot, in the PowerPivot for Excel window, click on the “Pivot Chart/Table” button from the “Home” ribbon:

Once you have an empty pivot table or chart in your Excel sheet, PowerPivot-based objects act very similarly to standard Excel pivot tables and charts. One thing worth mentioning, though, is that PowerPivot has the concept of Slicers. Slicers allow you to interactively and visually filter, or slice, the data based on your needs. In my sample, I use slicers for the "top-level" filters – services, WCF success/error status, and WF instance state. Here is what Slicers look like:

(I’m not great at circling things on the screen with a mouse, am I?)

When you select a PowerPivot-based chart or table, slicers can be configured similar to the standard legend, axis, and values settings – via the Pivot table/chart field list and properties pane. Just drag fields from your data model onto the Slicers areas:

One more thing on Slicers – you can configure a single slicer to filter multiple pivot charts or tables. To achieve this, select the Slicer, right-click on it, and then click on the PivotTable Connections menu:

The rest is not difficult to figure out – you use the checkboxes representing your pivot tables and charts to choose which ones you want the Slicer to interact with. This feature comes in useful when you want to slice and dice two or more charts together. For example, if you have one chart showing the number/count of calls over time, and another chart showing the average call duration over time, having a single slicer interacting with both charts provides a "correlated" view between the two: filtering by service using the slicer gives you both the number of calls (chart 1) and the average duration (chart 2) for just that service.

The Result

So, after all this, here is a screenshot of the sample report I created:

Once again, I want to highlight the key features:

  • Visual presentation of the historical data stored in the AppFabric monitoring store
  • Trend visualization and analysis of the WCF services load and call latency (average duration)
  • Trend visualization and analysis of the WF services load, long-running state, and efficiency (end-to-end service lifetime)
  • Error filtering and analysis
  • Interactive slicing and dicing per service, WCF service call result, and WF state
  • Time-based analysis – granular down to the minute, or as coarse as yearly.

The last piece in the puzzle, which is out of scope for this article, is to publish the PowerPivot spreadsheet to a SharePoint site, to make it available to all interested parties (secured appropriately through SharePoint, of course). Instructions on publishing a PowerPivot workbook to SharePoint are available here.

On a somewhat related topic, I should probably also mention that PowerPivot, when used directly within Excel, does not provide any data auto-refresh features such as a background refresh or refresh on open – data updates must be performed manually. Once uploaded to a SharePoint site however, the PowerPivot data connection can be configured to auto-refresh. Scheduling an auto-refresh within SharePoint is described in detail here.

Conclusion

The AppFabric monitoring store, combined with the flexibility, scale, and analytical strengths of PowerPivot, provides a solid foundation for operational reporting and analysis. Understanding the trends in the load and usage patterns of a system has proven critical for early mitigation and successful prevention of costly downtimes. For long-running workflow-based services, identifying the steps in the process that take the longest to execute can also easily lead to significant optimizations and improved efficiency of the business as a whole.

BizTalk 2010 Released with some goodies.

BizTalk 2010 hit the stands this week, and quite prominently up on the BizTalk 2010 site there's info about vNext.

The team has been busy, and the BizTalk 2010 Developer Edition is free! http://www.microsoft.com/biztalk/en/us/developer.aspx

Lots of info is up on the site – What's New: http://www.microsoft.com/biztalk/en/us/whats-new.aspx

  • In the What's New material you'll see 'enhanced Trading Partner Management', which typically gets flagged under EDI-based solutions. In a later post I'll show you how to work with Trading Partners from any solution; the bit that has me excited is that we can now store an arbitrary set of name/value pairs against each Trading Partner (and their individual agreements).

 

Initial Training – BizTalk 2010 Training Kit – http://www.microsoft.com/downloads/en/details.aspx?FamilyID=35c8fb51-a1e3-496e-841a-b48701a80c40


The BizTalk Server 2010 training kit includes labs and training videos to help you
learn about the new features of BizTalk Server 2010.

This training kit contains the following content:
Hands On Labs

  • Creating BizTalk Maps with the new Mapper

  • Consuming a WCF Service

  • Publishing Schemas and Orchestrations as WCF Services

  • Integrating with Microsoft SQL Server

  • Integrating using the FTP Adapter

  • Developers – Create a Role and Party-based Integration Solution

  • Exploring the New Settings Dashboard

  • Monitoring BizTalk Operations using System Center Operations Manager 2007 R2

  • Administrators – Create a Role and Party-based Integration Solution

 

Enjoy and stay tuned for the integration unraveling in the near future 🙂