Removing XML elements from an input document (with large message support)

In my last post I talked about taking an XML message and stripping out elements. In that post I used a standard MemoryStream object, and a reader asked that I also cover how the pipeline component could be built to stream large messages. Let's take a look at what those modifications would look like.

When dealing with large incoming messages you don’t always want to load the whole message into memory, which is what the MemoryStream object does. BizTalk includes a set of classes in Microsoft.BizTalk.Streaming.dll (located in the GAC) that provide a means of handling large messages. Within this assembly is the VirtualStream class (the source code can be found in the SDK under the \Program Files\Microsoft BizTalk Server 2006\SDK\Samples\Pipelines\ArbitraryXPathPropertyHandler directory in the VirtualStream.cs file). This class keeps the stream data in memory up to 4MB (the default threshold); once that threshold is reached, the additional data is written to a temporary file on the hard drive.
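
To make the overflow behaviour concrete, here is a minimal sketch of the idea using only the .NET base class library. It is not the VirtualStream source: the OverflowStream class, its threshold parameter and its IsOnDisk property are hypothetical names invented for this illustration, and unlike the description above it moves everything to the temporary file once the threshold is crossed. It assumes .NET 4 or later (for Stream.CopyTo).

```csharp
using System;
using System.IO;

// Hypothetical stand-in for illustration only; NOT the BizTalk VirtualStream class.
public sealed class OverflowStream : Stream
{
    private readonly long _threshold;           // bytes allowed in memory
    private Stream _inner = new MemoryStream(); // starts out in memory
    private bool _onDisk;

    public OverflowStream(long thresholdBytes) { _threshold = thresholdBytes; }

    public bool IsOnDisk { get { return _onDisk; } }

    public override void Write(byte[] buffer, int offset, int count)
    {
        // Spill before the write that would push the data past the threshold.
        if (!_onDisk && _inner.Position + count > _threshold)
        {
            SpillToDisk();
        }
        _inner.Write(buffer, offset, count);
    }

    private void SpillToDisk()
    {
        // The temp file lives in %temp% and is deleted when the stream is closed.
        string path = Path.GetTempFileName();
        FileStream file = new FileStream(path, FileMode.Create, FileAccess.ReadWrite,
            FileShare.None, 4096, FileOptions.DeleteOnClose);
        long position = _inner.Position;
        _inner.Position = 0;
        _inner.CopyTo(file);       // simplification: move everything written so far
        file.Position = position;  // continue where the caller left off
        _inner.Dispose();
        _inner = file;
        _onDisk = true;
    }

    public override int Read(byte[] buffer, int offset, int count) { return _inner.Read(buffer, offset, count); }
    public override long Seek(long offset, SeekOrigin origin) { return _inner.Seek(offset, origin); }
    public override void SetLength(long value) { _inner.SetLength(value); }
    public override void Flush() { _inner.Flush(); }
    public override bool CanRead { get { return true; } }
    public override bool CanSeek { get { return true; } }
    public override bool CanWrite { get { return true; } }
    public override long Length { get { return _inner.Length; } }
    public override long Position
    {
        get { return _inner.Position; }
        set { _inner.Position = value; }
    }

    protected override void Dispose(bool disposing)
    {
        if (disposing) { _inner.Dispose(); }
        base.Dispose(disposing);
    }
}

public static class OverflowDemo
{
    public static void Main()
    {
        using (OverflowStream s = new OverflowStream(8)) // tiny threshold for the demo
        {
            s.Write(new byte[4], 0, 4);
            Console.WriteLine("After 4 bytes, on disk: " + s.IsOnDisk);
            s.Write(new byte[6], 0, 6);
            Console.WriteLine("After 10 bytes, on disk: " + s.IsOnDisk);
        }
    }
}
```

The point of the design is that callers keep working against a single Stream and never need to know whether the bytes currently live in memory or in a file.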

If you decide that this functionality is what you are looking for, keep a couple of things in mind. Since the BizTalk service will be writing a file to the hard drive, you need to make sure that the BizTalk service account has permission to write to the %temp% folder (that is, the %temp% folder on each BizTalk server, for each host instance). Also, by default, the %temp% folder is placed on the C:\ drive, so decide whether that is where you want the temp file to be created. You will most likely want to configure your server to put it on a different drive, since C:\ is typically configured with a small amount of storage space. Finally, make sure that the new %temp% location is excluded from backups.
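
One way to check the write permission is a small probe like the sketch below. It is only meaningful when run under the same account as the BizTalk host instance, and the class and method names (TempFolderCheck, CanWriteToTemp) are made up for this example; only the System.IO calls are real.

```csharp
using System;
using System.IO;

public static class TempFolderCheck
{
    // Tries to create (and automatically delete) a small probe file in the
    // temp folder of the account this process is running under.
    public static bool CanWriteToTemp(out string tempDir)
    {
        tempDir = Path.GetTempPath();
        string probe = Path.Combine(tempDir, Path.GetRandomFileName());
        try
        {
            using (FileStream fs = new FileStream(probe, FileMode.CreateNew,
                FileAccess.Write, FileShare.None, 4096, FileOptions.DeleteOnClose))
            {
                fs.WriteByte(0);
            }
            return true;
        }
        catch (UnauthorizedAccessException)
        {
            return false; // the account lacks write permission
        }
        catch (IOException)
        {
            return false; // folder missing, disk full, or similar
        }
    }

    public static void Main()
    {
        string dir;
        bool ok = CanWriteToTemp(out dir);
        Console.WriteLine("Temp folder: {0}, writable: {1}", dir, ok);
    }
}
```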

So, the first thing we need to do is set a reference to Microsoft.BizTalk.Streaming.dll. It is located in the GAC, so you will not be able to add the reference to that location from within Visual Studio. Instead, copy the .dll out of the GAC to somewhere else on your hard drive and add the reference to that copy. You can do this from a command prompt by navigating to its location (on my machine it was C:\WINDOWS\assembly\GAC_MSIL\Microsoft.BizTalk.Streaming\3.0.1.0__…….).

Instead of using a MemoryStream we will use the VirtualStream object. When we new up our object we are presented with 6 overloads. We want the overload that lets us pass in a VirtualStream.MemoryFlag. The options of this enum are OnlyToDisk, OnlyInMemory and AutoOverFlowToDisk; we want AutoOverFlowToDisk. If you want to change the memory threshold, use the following overload: VirtualStream(int bufferSize, VirtualStream.MemoryFlag).

So, the Execute method now looks like this:

        public … Execute(.)
        {
            try
            {
                IBaseMessagePart bodyPart = inmsg.BodyPart;
                VirtualStream vs = new VirtualStream(VirtualStream.MemoryFlag.AutoOverFlowToDisk);

                if (bodyPart != null)
                {
                    Stream originalStream = bodyPart.GetOriginalDataStream();

                    if (originalStream != null)
                    {
                        XmlTextReader Xtr = new XmlTextReader(originalStream);

                        XmlTextWriter Xtw = new XmlTextWriter(vs, Encoding.UTF8);
   ..
   ..
   ..

             }

                vs.Position = 0;
                bodyPart.Data = vs;
                pc.ResourceTracker.AddResource(vs);
                return inmsg;
            }

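The lines that changed are the ones that use the VirtualStream object (the vs variable): creating it, handing it to the XmlTextWriter, rewinding it, assigning it to the body part and registering it with the resource tracker.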
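
The reader/writer loop itself is elided in the listing above. As a rough illustration of the general technique (and not necessarily the code from the previous post), the following self-contained sketch copies XML from one stream to another and skips every element whose local name is in a given list. The ElementStripper class, its Strip and CopyCurrentNode methods and the sample element names are all hypothetical. MemoryStream is used only to keep the demo runnable without BizTalk; in the pipeline component the output stream would be the VirtualStream.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;
using System.Xml;

public static class ElementStripper
{
    // Copies XML from input to output node by node, dropping every element
    // (and its whole subtree) whose local name is in namesToRemove.
    public static void Strip(Stream input, Stream output, ICollection<string> namesToRemove)
    {
        XmlTextReader reader = new XmlTextReader(input);
        XmlTextWriter writer = new XmlTextWriter(output, Encoding.UTF8);

        reader.Read();
        while (!reader.EOF)
        {
            if (reader.NodeType == XmlNodeType.Element &&
                namesToRemove.Contains(reader.LocalName))
            {
                // Skip() moves past the element's end tag, so the reader is
                // already on the next node; do not call Read() again here.
                reader.Skip();
                continue;
            }
            CopyCurrentNode(reader, writer);
            reader.Read();
        }
        // Flush, but do not close the writer: closing it would close the output stream.
        writer.Flush();
    }

    // Writes only the node the reader is currently positioned on.
    private static void CopyCurrentNode(XmlReader reader, XmlWriter writer)
    {
        switch (reader.NodeType)
        {
            case XmlNodeType.Element:
                writer.WriteStartElement(reader.Prefix, reader.LocalName, reader.NamespaceURI);
                writer.WriteAttributes(reader, true);
                if (reader.IsEmptyElement) { writer.WriteEndElement(); }
                break;
            case XmlNodeType.EndElement:
                writer.WriteFullEndElement();
                break;
            case XmlNodeType.Text:
                writer.WriteString(reader.Value);
                break;
            case XmlNodeType.CDATA:
                writer.WriteCData(reader.Value);
                break;
            case XmlNodeType.Whitespace:
            case XmlNodeType.SignificantWhitespace:
                writer.WriteWhitespace(reader.Value);
                break;
            case XmlNodeType.Comment:
                writer.WriteComment(reader.Value);
                break;
            default:
                // XML declaration, DOCTYPE, processing instructions and entity
                // references are deliberately ignored in this sketch.
                break;
        }
    }

    public static void Main()
    {
        byte[] xml = Encoding.UTF8.GetBytes(
            "<order><id>1</id><secret>x</secret><total>5</total></order>");
        using (MemoryStream input = new MemoryStream(xml))
        using (MemoryStream output = new MemoryStream())
        {
            Strip(input, output, new List<string> { "secret" });
            output.Position = 0;
            Console.WriteLine(new StreamReader(output).ReadToEnd());
        }
    }
}
```

Because the loop only ever holds the current node, memory use stays flat no matter how large the message is; the output stream decides where the bytes end up.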

BizTalk Environment Migration Checklist

My company’s “standard operating procedure” for BizTalk Server doesn’t call out the specific requirements to deploy among environments (development to test, test to production, etc), so I’m trying to help the team get those articulated.  Here’s my first stab at a checklist that should be followed for BizTalk application migration between environments.   I don’t want […]

BizTalk 2006 R2 Branch Edition – What’s the go?

I get a lot of questions about this. For those of you who have been mentally scarred
by early editions of BizTalk (haven’t we come a long way since then 🙂), there was
a BizTalk Partner Edition (lower priced, for hub/spoke type implementations)
which was limited in the number of Orchestrations/Trading Partners
etc. (like 10 Orchs and 2 partners, from memory).

With BizTalk 2006 R2 we have the Branch Edition which
retails for approx USD$1500 and it gives you……..

BizTalk Business Rules Engine (a lot of people want to use
the BRE as a centralised rule store in a cost-effective manner; until now you needed
at least BizTalk Standard, as the BRE is not available separately)

BizTalk RFID (what can I say here!!!)

A perfect application of the Branch Edition is to drop it in at
your trading partner’s site, which typically means less time to get up and running. For
the price, if consultants are spending more than 1.5 days trying to establish communications
with the other end, then you should be considering the Branch Edition, as
it understands all the classic forms of comms with BizTalk ‘proper’. By
no means is it limited to just BTS.

I thought I’d also give you the more formal description of what the Branch
Edition has/has not under the hood:

———————————————

BizTalk 2006 R2
Branch Edition

BizTalk Branch Edition is a specialty version of BizTalk Server designed for hub and
spoke deployment scenarios including RFID.

Scenarios:

1.     Hub-Spoke Deployment. In this scenario the Branch Edition is located in the regional or point
of sale locations and communicates with the hub (BizTalk Enterprise Edition).

2.     RFID Deployment. In this scenario the Branch Edition applies rules and business processes
to the raw data and communicates with the hub (BizTalk Enterprise Edition) to send
aggregated business data.

3.     Standalone Deployment. In this scenario the Branch Edition is used to execute a business process
and execute rules on the business data, but does not communicate with any central hub.

Supported Capabilities:

1.     General Transport Adapters like FILE, HTTP, HTTPS, MSMQ, FTP, SMTP, POP3 are available

2.     RFID Manager and RFID Adapter

3.     Host Integration Adapter

4.     Remote or local SQL Server database is supported. SQL Server Databases may be installed on
a failover Windows Cluster providing high availability of the BizTalk Databases.

5.     BizTalk base capabilities like Messaging, Orchestration, BRE, BAM, Management & Operations
and Development Tools are available

Limitations:

1.     No Line of Business Adapters are available.

2.     No Accelerators are available.

3.     Only one BizTalk Application can be deployed.

4.     BizTalk Server Group supports only one BizTalk Server. This means there is no fault tolerance,
no scale-out, and no failover clustering.

5.     A maximum of 2 processors is supported.

6.     No Virtual Processor is supported. This means the dual core is not leveraged in dual
core processors.

7.     Two or more Branch Editions separately deployed in different locations cannot communicate
with each other.

References:

1.     http://www.microsoft.com/biztalk/editions/default.mspx

Having problems undeploying that pesky BizTalk Assembly?

Have you ever received the following error message when trying to undeploy a BizTalk assembly?

“Some items in the removed assembly are still being used by items not defined in the same assembly, thus removal of the assembly failed.”

Make sure that items in the assembly you are trying to remove fulfill the following conditions:
1. Pipelines, maps, and schemas are not being used by Send Ports or Receive Locations
2. Roles have no enlisted parties.

Here are two queries to run against the BizTalk management database, for those of you on BizTalk 2004/2006. The first finds the maps and the second finds the non-default pipelines that are still in use on ports and might be getting in the way of your undeploy:

select 'RcvPort' PortType, r.nvcName Port, item.name MapName, assem.nvcName Assembly,
       indoc_docspec_name, outdoc_docspec_name
from bts_receiveport_transform rt
join bts_receiveport r on rt.nReceivePortID = r.nID
join bt_mapspec ms on ms.id = rt.uidTransformGUID
join bts_assembly assem on ms.assemblyid = assem.nID
join bts_item item on ms.itemid = item.id
union
select 'SendPort' PortType, r.nvcName Port, item.name MapName, assem.nvcName Assembly,
       indoc_docspec_name, outdoc_docspec_name
from bts_sendport_transform rt
join bts_sendport r on rt.nSendPortID = r.nID
join bt_mapspec ms on ms.id = rt.uidTransformGUID
join bts_assembly assem on ms.assemblyid = assem.nID
join bts_item item on ms.itemid = item.id
order by Assembly, PortType, Port

select 'SendPort' Type, ReceivePortName = '', bts_sendport.nvcName [SendPort/ReceiveLocation],
       bts_pipeline.[name] PipelineName
from bts_sendport
join bts_pipeline on bts_sendport.nSendPipelineID = bts_pipeline.[ID]
where bts_pipeline.[name] <> 'Microsoft.BizTalk.DefaultPipelines.XMLTransmit'
  AND bts_pipeline.[name] <> 'Microsoft.BizTalk.DefaultPipelines.PassThruTransmit'
union
select 'ReceiveLocation' Type, bts_receiveport.nvcName ReceivePortName,
       adm_receiveLocation.[Name] [SendPort/ReceiveLocation], bts_pipeline.[name] PipelineName
from adm_ReceiveLocation
join bts_pipeline on adm_receiveLocation.ReceivePipelineID = bts_pipeline.[ID]
join bts_receiveport on adm_receivelocation.receiveportid = bts_receiveport.nID
where bts_pipeline.[name] <> 'Microsoft.BizTalk.DefaultPipelines.XMLReceive'
  AND bts_pipeline.[name] <> 'Microsoft.BizTalk.DefaultPipelines.PassThruReceive'
order by Type, PipelineName

MS Partner Training on the horizon – wanna date?


Hot off the ‘Hot-Cross Bun’ RFID conveyor belt (Happy Easter, all!!!): local
SharePoint MVP funny man Ivan Wilson and I will be delivering
the sessions, which is great news… just have to get the content together… shhhhh… you didn’t
hear me say that 😉
(How do you have a conversation with more than 3 MVPs in the room??? You don’t, they
all talk about themselves 🙂 That one’s mine, not Ivan’s.)

MS Partner Training Schedule in the land of MOSS

This is for an Instructor Led ‘Chalk &
Talk Session’ designed for Pre-sales Technical Consultants, Technical Consultants,
Technical Project Managers, Architects and Business Analysts
 


Dates First:

Brisbane – April 3 & 4
Melbourne – April 7 & 8
Sydney – April 10 & 11

REGISTER HERE

What is being covered is:

1.     MOSS Capability Overview – a brief discussion of the six major functional areas in SharePoint
2007:

o    Collaboration

o    Portals

o    Search

o    Web Content Management

o    Business Forms

o    Business Intelligence

2.    Understanding the “MOSS Building Blocks” – a description of both the physical and logical components
that make up a SharePoint solution. We discuss how they fit together and how you can
combine these to ensure your solutions can scale to meet demand.

3.    A tour of the Central Administration site – gain an insight into how a SharePoint Farm
is administered.

4.    Applications, Site Collections and Subsites – explore the main components used to build any SharePoint
site. Learn what capabilities are managed at each level.

5.    Inside a subsite – now that we understand the high level components we can get into the
details of what makes up a subsite. We examine:

o    Document Libraries and the SharePoint Document Management concepts

o    Lists

o    Web Parts

o    Security

o    Navigation Controls

6.    Search – we look into the rich functionality in MOSS to allow users to quickly locate content
that exists inside and outside of SharePoint. We look at how the search capabilities
are administered and what options are available to fine-tune the search engine to
match your client’s needs.

7.    Web Content Management – we look at how SharePoint incorporates Web Content Management
functionality. This overview includes:

o    Workflows

o    Master Pages

o    Page Layouts

o    Content Deployment

o    Variations

o    Examples of public sites that use MOSS

8.    Business Data Catalog – the BDC provides a framework to gain access to information stored in
third-party products. Learn how SharePoint can make use of this content directly within
its own environment.


Goodbye thy:data, hello Logica (WM-data)

Hi everyone

Well, after 7 months at thy:data it became clear to me that thy:data wasn’t the place
I wanted to work. Luckily, at the same time, my old employer WM-data, which has changed
its name to Logica, was searching for someone with exactly my skills. So it didn’t
take long for us to reach an agreement, and therefore, on the 1st of April, I am starting
at Logica.

I am really looking forward to this. I will be going back to all my old colleagues,
and my assignments will vary a lot, ranging from presales, writing quotes and promoting
Logica to Microsoft, to the actual design and development of large integration projects
based on BizTalk, .NET 3.0 and so on.

Just 2 weeks to go…



eliasen

BizTalk migration – Testing

When migrating from BizTalk 2004 (or possibly 2002) to 2006, one might feel tempted to do some refactoring at the same time, fixing things you (or somebody else) could have done better the first time. If you are migrating a large BizTalk solution, you probably want to make sure everything works as it did before, without spending too much time testing. After all, it’s "just" a migration project.

So if what you are looking for is some kind of regression testing that leaves your developers free to focus on the migration task, you might find this post useful.

The basic principle is to let the messages that pass through the old environment also pass through the new 2006 solution, and then validate that the two outputs match. It is important to understand that this process only tests the content of the messages, not any possible connectivity problems. It requires you to deploy your 2006 ports using the File adapter on both the receive and the send side. When you deploy your solution to the production environment, these ports need to be changed back to whatever binding you used before. Another important thing to understand is that this process only works with one-way ports.

To make this happen we need to add a pipeline component at the beginning of every pipeline in the old environment. This component archives the incoming message to the location where the Receive Location in the new environment will pick it up. Hopefully you can achieve this by copying every pipeline into one new Visual Studio 2003 project and adding the pipeline component to the new pipelines.

The archive component should be designed to require as few system resources as possible, which is why it uses forward-only event streaming: the message is archived as it is read, and the stream is never read more than once. To learn more about ForwardOnly streaming, have a look at Johan's blog post.
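
As a rough idea of what forward-only archiving can look like, the sketch below wraps a source stream so that every byte a downstream reader pulls is also written to an archive stream. It is a simplified illustration built on System.IO only, not the component described in this post (whose source is not shown here); the ArchivingReadStream name and the demo data are hypothetical.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical read-through wrapper: archives bytes as they are consumed.
public sealed class ArchivingReadStream : Stream
{
    private readonly Stream _source;
    private readonly Stream _archive;

    public ArchivingReadStream(Stream source, Stream archive)
    {
        _source = source;
        _archive = archive;
    }

    public override int Read(byte[] buffer, int offset, int count)
    {
        int read = _source.Read(buffer, offset, count);
        if (read > 0)
        {
            _archive.Write(buffer, offset, read); // copy exactly what was read
        }
        else
        {
            _archive.Flush(); // end of the source stream
        }
        return read;
    }

    // Forward-only: no seeking, no writing, no known length.
    public override bool CanRead { get { return true; } }
    public override bool CanSeek { get { return false; } }
    public override bool CanWrite { get { return false; } }
    public override long Length { get { throw new NotSupportedException(); } }
    public override long Position
    {
        get { throw new NotSupportedException(); }
        set { throw new NotSupportedException(); }
    }
    public override long Seek(long offset, SeekOrigin origin) { throw new NotSupportedException(); }
    public override void SetLength(long value) { throw new NotSupportedException(); }
    public override void Write(byte[] buffer, int offset, int count) { throw new NotSupportedException(); }
    public override void Flush() { }

    protected override void Dispose(bool disposing)
    {
        if (disposing)
        {
            _archive.Dispose();
            _source.Dispose();
        }
        base.Dispose(disposing);
    }
}

public static class ArchivingDemo
{
    public static void Main()
    {
        byte[] message = Encoding.UTF8.GetBytes("<msg>hello</msg>");
        MemoryStream archive = new MemoryStream(); // a FileStream in a real archive
        Stream wrapped = new ArchivingReadStream(new MemoryStream(message), archive);

        // Whoever consumes the message drives the archiving as a side effect.
        byte[] buffer = new byte[4];
        while (wrapped.Read(buffer, 0, buffer.Length) > 0) { }

        Console.WriteLine("Archived bytes: " + archive.Length);
    }
}
```

In a pipeline component the wrapper would typically be assigned as the body part's data stream, so that the archive fills up as the rest of the pipeline reads the message, without a second pass over the data.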

With the updated pipeline assembly in place, all we need to do is (1) add an additional Send Port with identical filter settings, (2) change the pipeline to the updated one, and (3) set the archive path to wherever the receive location in the 2006 environment is pointing.

When you’ve done this, the orchestration will pick up the files from the Send Port locations and check if they are identical. The result will be sent to the database, along with the actual messages, for later use.

The orchestration is generic and can be used for every scenario. It has a parallel shape which prevents it from continuing until it has received two files with identical names. To ensure that both files come out with the same name, the pipeline component mentioned above promotes the ReceivedFileName property (file-property namespace) if it is not already set. This will be the case when the original message is received from, for instance, an MQ adapter.

It then sends the incoming messages to an archive, where you can pick them up later if you need to analyze their content. Next is a decision shape where the actual matching is done. The result is inserted into a database table, which you can query later to get all failed messages (if any).
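
How the matching is implemented is not shown in this post. One simple way to decide whether two messages are identical is a byte-by-byte comparison of the two streams, as in the sketch below; the StreamComparer class and its methods are hypothetical helpers written for this illustration, and a real comparison might instead need to ignore differences such as whitespace or timestamps.

```csharp
using System;
using System.IO;
using System.Text;

public static class StreamComparer
{
    // Returns true only if both streams yield exactly the same bytes.
    public static bool AreIdentical(Stream a, Stream b)
    {
        byte[] bufferA = new byte[8192];
        byte[] bufferB = new byte[8192];
        while (true)
        {
            int readA = ReadFull(a, bufferA);
            int readB = ReadFull(b, bufferB);
            if (readA != readB) { return false; } // different lengths
            if (readA == 0) { return true; }      // both ended with no difference
            for (int i = 0; i < readA; i++)
            {
                if (bufferA[i] != bufferB[i]) { return false; }
            }
        }
    }

    // Stream.Read may return fewer bytes than requested, so keep reading
    // until the buffer is full or the stream ends.
    private static int ReadFull(Stream s, byte[] buffer)
    {
        int total = 0;
        while (total < buffer.Length)
        {
            int n = s.Read(buffer, total, buffer.Length - total);
            if (n == 0) { break; }
            total += n;
        }
        return total;
    }

    public static void Main()
    {
        byte[] oldOutput = Encoding.UTF8.GetBytes("<order><id>1</id></order>");
        byte[] newOutput = Encoding.UTF8.GetBytes("<order><id>1</id></order>");
        bool same = AreIdentical(new MemoryStream(oldOutput), new MemoryStream(newOutput));
        Console.WriteLine("Messages identical: " + same);
    }
}
```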

If you find this post of interest, let me know, and I'll post the source code.

BizTalk Sftp Adapter – updated to support dynamic send ports

We were asked if the adapter supported dynamic send ports, and if it was configurable from within orchestrations. As this was not the case, Johan found himself compelled to solve this. Although the documentation is not yet updated, the source code and setup files are released on CodePlex.

If you want to know more about how to use the adapter with dynamic send ports, have a look at Johan's blog post.