Understanding the BizTalk Mapper: Part 12 - Performance and Maintainability

In this section:
Performance

   Summary of Tests

   Testing performance in isolation (non-BizTalk)

   Performance Test Results

   Measuring Memory Usage in BizTalk

   BizTalk Memory Test Results

   Byte Arrays

   Analysing the performance results

Maintainability

   External XSLT

   Serializable Classes

   Why is it so difficult to edit code in the Script functoid?

   Documentation

Any large BizTalk project will likely have had the inevitable conversations about
performance and maintainability: will it be fast/sustainable enough, and will the
tech support team (or whoever looks after the code once the developers have finished)
be able to maintain it?

In this post I want to look at the performance of the Mapper, and at how maintainable
maps are in general.

To do this, I want to look at the different options you have for executing XSLT
with the Mapper, and compare them to the most common non-Mapper mechanism for performing
transformation: using serializable classes.

Note:

This is the twelfth in a series of 13 posts about the BizTalk Mapper.
The other posts in this series are (links will become active as the posts become active):
Understanding the BizTalk Mapper: Part 1 – Introduction

Understanding the BizTalk Mapper: Part 2 – Functoids Overview

Understanding the BizTalk Mapper: Part 3 – String Functoids

Understanding the BizTalk Mapper: Part 4 – Mathematical Functoids

Understanding the BizTalk Mapper: Part 5 – Logical Functoids

Understanding the BizTalk Mapper: Part 6 – Date/Time Functoids

Understanding the BizTalk Mapper: Part 7 – Conversion Functoids

Understanding the BizTalk Mapper: Part 8 – Scientific Functoids

Understanding the BizTalk Mapper: Part 9 – Cumulative Functoids

Understanding the BizTalk Mapper: Part 10 – Database Functoids

Understanding the BizTalk Mapper: Part 11 – Advanced Functoids

Understanding the BizTalk Mapper: Part 12 – Performance and Maintainability

Understanding the BizTalk Mapper: Part 13 – Is the Mapper the best choice for Transformation in BizTalk?

When all of the above have been posted, the entire series will be downloadable as
a single Microsoft Word document, or Adobe PDF file.

In BizTalk, your options for transformations using the Mapper are:

Built-in functoids
Inline XSLT script
Inline C# / VB.NET / JScript.NET
Classes/methods in external assemblies
Custom functoids
External XSLT file

Normally you’d use a combination of these in a complex BizTalk solution.

So how do you decide which to use?
Which gives you the best performance?
Which option(s) are the easiest to maintain?

The answer is: it depends!

In this section I’m going to try to give you some hard data you can use to
answer these questions. In the next post, I’ll try to answer them.

Performance

Performance is a very subjective subject. You could spend weeks getting your maps
to execute in under 10ms, but if your performance requirement is "anything less
than 100ms" then why bother?

What’s more important is that performance is "good enough" and is sustainable.

Sustainable performance is the key: it’s one thing to be able to perform a single
transformation in less than 10ms, but what about performing 40 simultaneous transformations
continuously for over an hour?
And what about the memory footprint?

When we talk about performance, we also need to look at memory/resource usage: If
you have a very fast system that for each transformation uses 1MB memory, then at
a certain point under sustained load your memory usage will start to affect performance
as memory is paged out to disk (assuming that garbage collection can’t keep up with
the number of transformations you’re performing).

In order to find out the performance and memory footprints of the various options,
I put together some tests.

I ran two suites of tests:

Using a stand-alone test harness with the XslTransform class to perform transformation in isolation (this gives the comparative performance)
Using a BizTalk application to measure memory usage (this gives the comparative memory usage)

Summary of Tests

I simplified the six options above into four separate XSLT tests, and then added two
tests involving serializable classes for comparison:

1. Standard Map
   A map using default built-in functoids, with a mix of functoids that emit XSLT and
   functoids that emit C#
2. Map using external XSLT
   A map which used an external XSLT file consisting of pure XSLT, i.e. no external assemblies
   or inline script
3. Map using inline C#
   A map using inline C# code in a Script functoid
4. Map using external assembly
   A map using an external assembly via a Script functoid
5. Transformation using classes
   De-serializing the input message into a class, transforming to a new class, and serializing
   the class back into the output message
6. Transformation using classes and serializers
   Same as 5 above, but using the sgen.exe tool to pre-generate a serialization assembly

My test scenario was: Transforming a message containing 20 employee records into a
new message which contained separate Manager and Staff records:
[Images in the original post: the input message, and the output message it becomes.]

Each of the maps performed the transformation in the same way, as much as was possible.
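
As a rough illustration of the class-based approach (de-serialize, transform, serialize), here is a minimal Python sketch. It is not the code used in these tests, which was .NET. The element names, attribute names, and sample data are all hypothetical stand-ins, since only the general shape of the messages (employee records split into separate Manager and Staff records) is described above.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass

# Hypothetical input message: names and structure are invented for illustration.
SAMPLE = """<Employees>
  <Employee name="Ann" role="Manager"/>
  <Employee name="Bob" role="Staff"/>
  <Employee name="Cy" role="Staff"/>
</Employees>"""


@dataclass
class Employee:
    name: str
    role: str


def deserialize(xml_text):
    """Turn the input message into a list of Employee objects."""
    root = ET.fromstring(xml_text)
    return [Employee(e.get("name"), e.get("role")) for e in root.findall("Employee")]


def transform(employees):
    """Split the employees into managers and staff."""
    managers = [e for e in employees if e.role == "Manager"]
    staff = [e for e in employees if e.role != "Manager"]
    return managers, staff


def serialize(managers, staff):
    """Build the output message with separate Manager and Staff records."""
    root = ET.Element("Company")
    for m in managers:
        ET.SubElement(root, "Manager", name=m.name)
    for s in staff:
        ET.SubElement(root, "Staff", name=s.name)
    return ET.tostring(root, encoding="unicode")


def split_employees(xml_text):
    """De-serialize, transform, and serialize in one call."""
    return serialize(*transform(deserialize(xml_text)))


if __name__ == "__main__":
    print(split_employees(SAMPLE))
```

The three steps are kept as separate functions so that each can be timed or replaced on its own, which is roughly what distinguishes the class-based tests from the XSLT-based ones.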

Testing performance in isolation (non-BizTalk)

For the XSLT tests, I tested using both the XslTransform class
(used by BizTalk) and the newer XslCompiledTransform class
(for comparison).

For both of these class tests I used static and non-static instances of the classes:

In the static test a transform class was created once and then re-used for each iteration
In the non-static test a new transform class was created for each iteration

I ran each test twice: once with 20 iterations, and once with 500 iterations.

I also measured the amount of memory in use before and after each test – from this
I calculated a rough "memory used" for each test (this is before garbage
collection kicked in).

For each test I show:

Total
   This is the total time (in ms) that the test took to run
Average
   This is the average time for each iteration
Average without first
   This is the average time without the first iteration
Average setup
   This is the average time it took to create the transform object in each iteration
   (will be 0 for the static tests as creation of transform is performed once and not
   included in the timings)
Average transform
   This is the average time it took to perform the transformation
Memory Used
   This is the amount of memory used to perform the entire test
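
The timing figures listed above could be collected by a harness along these lines. This is a language-neutral sketch in Python rather than the actual .NET harness: `run_test`, `make_transform`, and the stand-in transform are hypothetical names, and the `static` flag mirrors the static/non-static distinction described earlier. Memory measurement is left out.

```python
import time


def run_test(make_transform, document, iterations, static):
    """Time repeated transforms and return the per-test figures in ms.

    make_transform: hypothetical factory returning a callable transform.
    static: if True, one transform is created up front and re-used, and its
            creation is excluded from the timings (setup is reported as 0).
    """
    shared = make_transform() if static else None
    setup, work, whole = [], [], []

    started = time.perf_counter()
    for _ in range(iterations):
        t0 = time.perf_counter()
        transform = shared if static else make_transform()
        t1 = time.perf_counter()
        transform(document)
        t2 = time.perf_counter()
        setup.append(0.0 if static else (t1 - t0) * 1000)
        work.append((t2 - t1) * 1000)
        whole.append((t2 - t0) * 1000)
    total = (time.perf_counter() - started) * 1000

    rest = whole[1:] or whole  # fall back to all iterations if only one ran
    return {
        "total": total,
        "average": sum(whole) / len(whole),
        "average_without_first": sum(rest) / len(rest),
        "average_setup": sum(setup) / len(setup),
        "average_transform": sum(work) / len(work),
    }


def make_upper_transform():
    """Stand-in for an expensive transform object (e.g. a loaded stylesheet)."""
    time.sleep(0.002)  # simulate the cost of creating the transform
    return str.upper


if __name__ == "__main__":
    for static in (False, True):
        result = run_test(make_upper_transform, "<Employees/>", 20, static)
        print("static" if static else "non-static", result)
```

Reporting "average without first" separately is useful because the first iteration often pays one-off costs (loading, JIT compilation, caching) that would otherwise skew a small test run.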

Performance Test Results

The results of the 500 iteration tests are:

	Standard Map	Map with External Assembly	Map with Inline Script	Map with External XSLT	Class Transform	Class Transform with Serializer
Iterations	500	500	500	500	500	500
XslTransform – NonStatic
Total (ms)	82402	78815	79507	1622	1749	728
Average (ms)	164	157	159	3	3	1
Average without first (ms)	164	157	158	3	2	1
Average setup (ms)	151	151	152	1	1	0
Average transform (ms)	12	5	5	1	1	1
Memory Used	504MB	452MB	456MB	62MB	90MB	86MB
XslTransform – Static
Total (ms)	4392	2234	1564	720	769	735
Average (ms)	8	4	3	1	1	1
Average without first (ms)	8	4	3	1	1	1
Average setup (ms)	0	0	0	0	0	0
Average transform (ms)	8	4	3	1	1	1
Memory Used	64MB	40MB	40MB	30MB	88MB	88MB
XslCompiledTransform – NonStatic
Total (ms)	101214	88181	89608	16325
Average (ms)	202	176	179	32
Average without first (ms)	202	176	179	32
Average setup (ms)	156	145	147	3
Average transform (ms)	45	30	30	28
Memory Used	192MB	141MB	145MB	82MB
XslCompiledTransform – Static
Total (ms)	95	131	70	55
Average (ms)	0	0	0	0
Average without first (ms)	0	0	0	0
Average setup (ms)	0	0	0	0
Average transform (ms)	0	0	0	0
Memory Used	14MB	15MB	13MB	13MB

(the lowest result on each row is highlighted in green)

And for comparison, the results from the 20 iteration tests:

	Standard Map	Map with External Assembly	Map with Inline Script	Map with External XSLT	Class Transform	Class Transform with Serializer
Iterations	20	20	20	20	20	20
XslTransform – NonStatic
Total (ms)	3820	3931	3718	78	1213	82
Average (ms)	191	196	185	3	60	4
Average without first (ms)	171	194	184	3	4	1
Average setup (ms)	175	189	178	1	56	1
Average transform (ms)	15	7	6	1	4	2
Memory Used	20MB	18MB	17MB	2MB	4MB	3MB
XslTransform – Static
Total (ms)	169	100	66	29	122	83
Average (ms)	8	5	3	1	6	4
Average without first (ms)	8	4	3	1	2	1
Average setup (ms)	0	0	0	0	2	1
Average transform (ms)	8	4	3	1	3	2
Memory Used	4MB	1MB	1MB	1MB	3MB	3MB
XslCompiledTransform – NonStatic
Total (ms)	4628	3758	5094	1190
Average (ms)	231	187	254	59
Average without first (ms)	224	183	254	50
Average setup (ms)	180	154	211	11
Average transform (ms)	50	32	42	48
Memory Used	8MB	6MB	6MB	3MB
XslCompiledTransform – Static
Total (ms)	46	39	67	81
Average (ms)	2	1	3	4
Average without first (ms)	0	0	1	1
Average setup (ms)	0	0	0	0
Average transform (ms)	2	1	3	4
Memory Used	642KB	647KB	583KB	542KB

(the lowest result on each row is highlighted in green)

Measuring Memory Usage in BizTalk

Although the Memory Used amount from the performance tests was useful, I wanted
to know exactly how much memory BizTalk used for performing transformations – and
what objects were in memory.
In order to measure this I used SciTech Software’s .NET
Memory Profiler.
This tool attaches to the BizTalk service (BTSNTSvc.exe) and can create a snapshot
of all the object instances currently in use, including how many there are and how
much memory they’re using.

I created a BizTalk application which contained a separate map for each of the tests
above, and created orchestrations to execute the maps (and to call the C# code to
perform the transform using the classes).

I performed a memory snapshot before and after running the orchestrations, and restarted
the BizTalk service between each test.

I ran each test twice: once with a single message, and once with 50 messages.

For each test I show:

Byte[] Instances
   This is the count of Byte arrays in use (the relevance of Byte arrays is explained
   below)
Byte[] Instances Size (MB)
   This is the total size of all Byte arrays (i.e. the size of all the data they contain)
Total Instances
   This is the count of all .NET objects in use
Total Instances Size (MB)
   This is the total size of all .NET objects in use

Note that any sizes measured are after garbage collection has occurred i.e.
these are objects which are still classed as being in-use.

BizTalk Memory Test Results

The results I measured were:
Single Message:

Test – 1 Iteration	Byte[] Instances	Byte[] Instances Size (MB)	Total Instances	Total Instances Size (MB)
Standard	5,652	0.63	17,277	1.78
External XSLT	5,752	0.59	19,038	2.11
Inline Script	5,763	0.63	20,522	2.23
Referenced Assembly	5,757	1.25	19,897	2.42
Class	30	0.03	7,476	0.74

(the lowest result in each column is highlighted in green)

50 Messages:

Test – 50 Iterations	Byte[] Instances	Byte[] Instances Size (MB)	Total Instances	Total Instances Size (MB)
Standard	146	1.63	18,354	3.24
External XSLT	126	0.80	14,359	1.87
Inline Script	5,780	1.89	22,590	3.13
Referenced Assembly	5,757	1.36	20,100	3.00
Class	146	0.85	14,202	1.81

(the lowest result in each column is highlighted in green)

Byte Arrays

I can let Microsoft explain this in their own words (this is taken from a knowledge
base article here):

The System.Policy.Security.Evidence object is often used
in transforms and can consume a lot of memory. Whenever a map contains a scripting
functoid that uses inline C# (or any other inline language), the assembly is created
in memory. The System.Policy.Security.Evidence object uses the object of the actual
calling assembly. This situation creates a rooted object that is not deleted until
the BizTalk service is restarted.

Most of the default BizTalk functoids are implemented
as inline script. These items can cause System.Byte[] objects to collect in memory.
To minimize memory consumption, we recommend that you put any map that uses these
functoids into a small assembly. Then, reference that assembly. Use the chart below
to determine which functoids use inline script and which functoids do not use inline
script.

In the second column, “Yes” means that this functoid is
implemented as inline script, and that it will cause System.Byte[] objects to collect
in memory. “No” means that this functoid is not implemented as inline script, and
that it will not cause System.Byte[] objects to collect in memory.

Functoids	Inline script?
All String Functoids	Yes
All Mathematical Functoids	Yes
All Logical Functoids except IsNil	Yes
Logical IsNil Functoid	No
All Date/Time Functoids	Yes
All Conversion Functoids	Yes
All Scientific Functoids	Yes
All Cumulative Functoids	Yes
All Database Functoids	No
All Advanced Functoids (apart from Script functoids using Inline C#/VB/Jscript)	No

Basically what they’re saying is: whenever you
use the default functoids, or inline code (i.e. C# or any other .NET language) the
assembly containing the map is used as evidence to the XslTransform class
to ensure that it’s safe to use scripts. And this assembly is loaded into the appropriate
AppDomain and kept there. In fact, it’s loaded in as a byte array (byte[]).

Assemblies created by compiling inline script in an XSLT are temporary assemblies
and are loaded into the appropriate AppDomain – they will remain in memory until the
AppDomain is unloaded i.e. the BizTalk Host Instance is restarted.

So if you keep all your maps, orchestrations, schemas, etc in one big assembly, then
that assembly will stay loaded in memory until the BizTalk Host Instance that loaded
it is restarted.
Solution? Keep your maps (especially maps using inline C#) in a separate project/assembly
– and try and keep that assembly as small as possible!

Analysing the performance results

The results shouldn’t really come as any surprise.
What they say in a nutshell is: pure XSLT is much faster than XSLT which uses inline
script or referenced assemblies. How much faster? Well, my tests show an average 5500%
speed increase over using the default functoids (i.e. 55 times faster)!
Additionally, using pure XSLT uses 1/8 the memory.
Of course, your mileage will vary.
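
For reference, figures of this size can be reproduced from the 500-iteration "XslTransform – NonStatic" rows of the results table. The short Python calculation below assumes those are the rows the headline numbers are based on; that is an inference from the table, not something stated explicitly.

```python
# Figures copied from the 500-iteration "XslTransform - NonStatic" rows.
standard = {"average_ms": 164, "total_ms": 82402, "memory_mb": 504}
external_xslt = {"average_ms": 3, "total_ms": 1622, "memory_mb": 62}

# Ratio of per-iteration averages (the averages are rounded to whole ms).
speedup_avg = standard["average_ms"] / external_xslt["average_ms"]
# Ratio of unrounded totals, as a cross-check.
speedup_total = standard["total_ms"] / external_xslt["total_ms"]
# Ratio of memory used across the whole test.
memory_ratio = standard["memory_mb"] / external_xslt["memory_mb"]

print(f"speed-up from averages: about {speedup_avg:.0f}x")
print(f"speed-up from totals:   about {speedup_total:.0f}x")
print(f"memory:                 about 1/{memory_ratio:.0f}")
```

Note that in the static rows, where the transform object is created once and re-used, the gap between the Standard Map and External XSLT is much smaller (4392ms against 720ms in total), so most of the difference comes from setup cost rather than from the transformation itself.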

What’s interesting though is how fast using serializable classes is i.e. de-serializing
a message into a class, performing operations to create a new class, and then serializing
the new class into a message.
When used with a pre-generated serialization assembly, this mechanism chases closely
behind using pure XSLT (and actually beat it in one of the tests).

Maintainability

One of BizTalk’s trump cards is the BizTalk Mapper: you can create maps which can
be easily maintained – what’s more, because the Mapper is a visual tool you can see
at a glance how the mapping works.
At least, that’s the theory.
If you have a relatively simple map which uses no Script functoids, and has fewer than,
say, 50 connections then yes, I’d say this was true: it’s easy to see what the map
does, and probably easy to maintain it.

But if you have a map with 1000 connections, or with a whole smattering of Script
functoids (or a bird’s nest of Logical functoids) then No, I don’t think it’s easy
to see what the map does or to maintain it.

At what point do you have to admit defeat with a map and say that it’s got a bit
too complex, or that the next developer to come along will have problems maintaining
it?
In that case, would you be better off using external XSLT or serializable classes?

External XSLT

One of the main complaints I hear about using external XSLT with the Mapper is that
it’s difficult to maintain. This can be true – if your editing tool is Notepad! But
there are great tools around for maintaining XSLT – have a look at Altova’s
MapForce for an example of one.
The other complaint is that XSLT is difficult, or hard to learn. Well, so is C# if
you’ve never used it before.
Go buy a book on it, or do a course!

Truth be told, if you work for a company which uses XSLT for other projects, then
you’re more likely to find support for using it as external XSLT in BizTalk.
Some companies have teams in IT which do nothing but create XSLT.

My point here is that although there’s a myth that external XSLT is hard to maintain,
it’s a lot easier to maintain than a complicated map. And if you use the right tool,
you can get a graphical view of what it does as well.

Serializable Classes

In my experience, it’s very very common for developers to create a whole slew of transform
and utility classes for handling transformations.
Sometimes this is because it’s the best way to do it.
Other times it’s because they simply didn’t know how to achieve something in the Mapper.
One of the best features of BizTalk (the ability to call out to C# classes/methods
from an orchestration or map) can also be its worst: Just because you *can* create
C# code, doesn’t mean that you *should*.

Maintaining poorly written C# code is a nightmare.
So if you’re using serializable classes to perform transformations make sure they’re
well written and well documented – but most importantly: understand *why* you’re using
them over XSLT.

Why is it so difficult to edit code in the Script functoid?

Ever wondered why you can’t resize the Script functoid code window? Or do a Ctrl-A
to select all the code in it? It turns out there is actually a reason for it…

Scott Woodgate (former Product Manager for BizTalk) had this to say…
The article is correct when it points out we discourage the use of .NET objects
directly inside the map. While this is possible, we encourage good developer design
which is encapsulating code in external assemblies. This turns out to be much better
because you have a single assembly with code shared across multiple maps that can
be versioned once and managed more easily

Unfortunately, this restriction also means that it’s difficult to use inline XSLT,
which is a shame.

Documentation

You can’t get away from the fact that code that is easy to maintain is either self-documenting,
or is accompanied by excellent documentation.
Regardless of whether you use maps, external XSLT, or serializable classes, you should really
document your transforms: explain what they do, how they do it, why they do it, and
most importantly, give some context: explain why you chose to do it that particular
way.
A developer who has to maintain your code in two years’ time might not have your background
knowledge of the development and political issues needed to understand why you made your choices.

The next post is going to look at the performance/maintainability of the different
transformation options and attempt to help you to decide when to use which option.
