WF4 runtime performance

Welcome to the first in a series of blog posts on WF4 performance. In this post, we will discuss the core workflow runtime. An important reason many of our customers are moving from WF3 to WF4 is the large performance improvements they realize with the WF4 runtime. This post describes a few of these improvements, provides some background on what we did to make it happen, and warns about some common pitfalls. Please note that the data contained in this post is based on prerelease software and is subject to change.

As an example, consider the following While activity built with WF4:

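// Requires: using System.Activities; using System.Activities.Statements;

// Loop counter, scoped to the While activity below.
Variable<int> i = new Variable<int>();

// Keep running the body while i < 100.
Activity workflow = new While(env => i.Get(env) < 100)
{
    Variables = { i },
    Body = new Increment
    {
        X = i,
    },
};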


// Custom activity that adds 1 to the variable bound to X.
public class Increment : CodeActivity
{
    // In/out argument: read the current value, then write it back.
    public InOutArgument<int> X { get; set; }

    protected override void Execute(CodeActivityContext context)
    {
        X.Set(context, X.Get(context) + 1);
    }
}

In the lab, we observe that the throughput of this WF4 workflow is 273 times that of a comparable WF3 workflow. In other words, on this lab machine, we execute 273 times as many WF4 workflows per second as WF3 workflows.
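If you want a rough feel for throughput on your own machine, a minimal sketch along these lines may help. It is not the harness used in the lab: the `ThroughputSketch` class name and the `runs` count are assumptions made for illustration, and a simple stopwatch loop like this only gives a ballpark number.

```csharp
using System;
using System.Activities;
using System.Activities.Statements;
using System.Diagnostics;

// Same Increment activity as above, repeated so the sketch is self-contained.
public sealed class Increment : CodeActivity
{
    public InOutArgument<int> X { get; set; }

    protected override void Execute(CodeActivityContext context)
    {
        X.Set(context, X.Get(context) + 1);
    }
}

public static class ThroughputSketch // hypothetical name
{
    public static void Main()
    {
        Variable<int> i = new Variable<int>();
        Activity workflow = new While(env => i.Get(env) < 100)
        {
            Variables = { i },
            Body = new Increment { X = i },
        };

        // Build the definition once and reuse it for every run.
        WorkflowInvoker invoker = new WorkflowInvoker(workflow);
        invoker.Invoke(); // warm-up run, so one-time costs are not measured

        const int runs = 1000; // assumed iteration count, tune as needed
        Stopwatch timer = Stopwatch.StartNew();
        for (int n = 0; n < runs; n++)
        {
            invoker.Invoke();
        }
        timer.Stop();

        Console.WriteLine("{0:F0} workflows/second",
            runs / timer.Elapsed.TotalSeconds);
    }
}
```

Numbers from a loop like this depend heavily on hardware and build configuration, so treat them as relative rather than absolute.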

Now, the performance improvement will vary greatly from one workflow definition to another. It will be far less dramatic for many workflows and even more dramatic for some, but overall these numbers reflect the work we’ve put into making WF4 far more performant.

There are multiple behind-the-scenes reasons for why you see these improvements:

– Introduction of a formal data flow model and scoping rules

There were no scoping rules for data in WF 3.0 or 3.5: you could bind activity properties without clear boundaries. As a result, the runtime had to keep significant state around for the lifetime of your workflow.

In WF4, we provide a consistent model for sharing state among activities (with variables, arguments, and expressions). More importantly, we also have rules for data scoping that enable us to clean up state very efficiently once it is no longer needed. In WF4, only the variables of currently executing activities are considered active state.

In WF3, by contrast, the runtime could not reason about what was in scope or what state was no longer needed, and this forced us to carry around a lot of unnecessary state.
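To make the scoping rules concrete, here is a minimal sketch of two nested Sequence activities. The names `ScopingSketch`, `total`, and `scratch` are made up for illustration. The inner variable is visible only to the inner Sequence, so its value can be released as soon as that Sequence completes, while the outer variable lives until the outer Sequence finishes.

```csharp
using System.Activities;
using System.Activities.Statements;

public static class ScopingSketch // hypothetical name
{
    public static Activity Build()
    {
        // Visible to the outer Sequence and everything nested inside it.
        Variable<int> total = new Variable<int>("total");

        // Visible only inside the inner Sequence.
        Variable<string> scratch = new Variable<string>("scratch");

        return new Sequence
        {
            Variables = { total },
            Activities =
            {
                new Sequence
                {
                    Variables = { scratch },
                    Activities =
                    {
                        new Assign<string> { To = scratch, Value = "temp" },
                        new Assign<int> { To = total, Value = 1 },
                    },
                },
                // scratch is out of scope here; only total is still reachable.
                new WriteLine
                {
                    Text = new InArgument<string>(env => total.Get(env).ToString()),
                },
            },
        };
    }
}
```

Running it with `WorkflowInvoker.Invoke(ScopingSketch.Build())` executes both Sequences in order.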

– Separation of activity definition from activity instance

In WF3, activity types were configured at design time, and the same types were instantiated at runtime. This model had performance implications, because for each workflow instance, the entire activity tree was kept in memory (or persisted when it was unloaded).

In .NET 4.0, we’ve separated the notion of program definition from activity instance. The type you use to author an activity is instantiated only as part of a workflow definition. It is not maintained as part of instance state.

Instead, scheduling in WF4 creates lightweight ActivityInstances, which represent the execution state for an activity. As mentioned in the first point above, this execution state primarily consists of the values of the variables declared on that activity.

– Improved activity cleanup and activity creation

Beyond the separation of activity definition from activity instance, we also modified the activity lifecycle so that we proactively clean up activities that are no longer executing. Just as important, we’ve removed the initialized state from activities. Now, an activity instance is created only when it is scheduled.

– Elimination of bottlenecks that made looping constructs significantly less performant in WF3

For all the reasons described above, writing activities that performed any kind of looping in WF3 was challenging, because you had to maintain a template activity and then spawn and clean up children. Also, in WF3, every time we ran the child of a While, we used the binary formatter to replicate the entire tree. This was extremely expensive, and as a result, performance-sensitive workflow developers started eliminating loop constructs from workflows altogether!

Now, because activity types are instantiated only as part of the workflow definition and are not maintained as part of instance state, writing looping activities is far easier, and scheduling them is significantly more performant.

This should begin to provide a view of how we’ve invested in performance in just the workflow runtime. Persistence has been intentionally left out of this discussion; a future post will cover persistence performance.

Before you leave, here are two gotchas to keep in mind: common performance mistakes we’ve seen customers make.

– Not caching the workflow definition. The first common mistake is not caching the workflow definition before execution. In other words, instead of creating the definition once and reusing it for subsequent executions, some customers unintentionally re-create the definition from scratch on every invocation (in the example above, that would mean instantiating a new While activity before every execution). Re-creating the workflow definition is a bad idea not just because of the object creation, but also because preparing the definition involves calling CacheMetadata to build the runtime activity tree.
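One way to avoid this mistake is to build the definition once and hold it in a static field. The sketch below is only an illustration: the `CachedWorkflow` class and its members are hypothetical names, and the loop body uses the built-in Assign activity in place of Increment to keep the example self-contained.

```csharp
using System.Activities;
using System.Activities.Statements;

public static class CachedWorkflow // hypothetical name
{
    // Built once; every invocation below shares this definition.
    private static readonly Activity Definition = BuildDefinition();

    private static Activity BuildDefinition()
    {
        Variable<int> i = new Variable<int>();
        return new While(env => i.Get(env) < 100)
        {
            Variables = { i },
            Body = new Assign<int>
            {
                To = i,
                Value = new InArgument<int>(env => i.Get(env) + 1),
            },
        };
    }

    public static void Run()
    {
        // Reuses the cached definition instead of constructing a new While.
        WorkflowInvoker.Invoke(Definition);
    }
}
```

The pattern to avoid is calling something like `BuildDefinition()` inside `Run()`, which would rebuild the activity tree on every invocation.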

– Blocking the thread. The second common mistake is blocking the workflow thread, often by unknowingly performing expensive operations inside activities. In the WF4 threading model, one thread at a time is used to process work items in a workflow instance. While this helps simplify the programming model for activities, it also means that activities should not block this thread. Note that expensive operations come in many forms, so it is not always obvious when an activity is blocking. WF4’s new asynchronous activity capability allows you to perform work off of the workflow thread while keeping the instance in memory; for more information, please see the new async activity sample in the .NET 4 RC SDK.
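As a rough illustration of the asynchronous option, the sketch below derives from AsyncCodeActivity<TResult> and pushes the slow work onto a thread-pool thread through a delegate's BeginInvoke/EndInvoke pair. The `SlowComputation` activity, its `Input` argument, and the `DoExpensiveWork` method are invented for this example, with the sleep standing in for real work; this is not the SDK sample itself.

```csharp
using System;
using System.Activities;
using System.Threading;

public sealed class SlowComputation : AsyncCodeActivity<int> // hypothetical activity
{
    public InArgument<int> Input { get; set; }

    protected override IAsyncResult BeginExecute(
        AsyncCodeActivityContext context, AsyncCallback callback, object state)
    {
        // Read arguments here, while still on the workflow thread.
        int input = Input.Get(context);

        // Keep the delegate so EndExecute can collect the result.
        Func<int, int> work = DoExpensiveWork;
        context.UserState = work;

        // Starts the work on a thread-pool thread and returns immediately.
        return work.BeginInvoke(input, callback, state);
    }

    protected override int EndExecute(
        AsyncCodeActivityContext context, IAsyncResult result)
    {
        Func<int, int> work = (Func<int, int>)context.UserState;
        return work.EndInvoke(result);
    }

    // Stand-in for a blocking operation such as I/O or a long computation.
    private static int DoExpensiveWork(int input)
    {
        Thread.Sleep(1000);
        return input * 2;
    }
}
```

The workflow thread is released between BeginExecute and EndExecute, so the blocking call no longer stalls it.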

If you’d like to see any other topics discussed in the future, please let us know!
