[Source: http://geekswithblogs.net/EltonStoneman]

In my previous post, Cloud Services and Command-Query Separation: Part 2, I walked through a sample Command-Query Separated service bus solution using readily available cloud services for communication. In this one, I’ll look at some of the implications of shifting systems integration to the cloud, compared to an on-premise ESB. The focus here is mainly on Amazon Web Services, but I’ll cover Azure with a dedicated sample project.

Cost & Non-Functional Requirements

If an ESB is core to your system landscape, it needs to be reliable and scalable. With a BizTalk-based solution you get this built-in with the architecture, but it doesn’t come cheaply. For a minimally-resilient enterprise-grade solution you’ll need at least two BizTalk servers in a group running against a SQL Server cluster on two nodes; for DR you’d need the same again in a separate data centre. Buying all the hardware and licenses could take you to £100K, not allowing for ongoing operating, maintenance and site costs.

In comparison the cloud solution is likely to be far more reliable at a far lower cost. Both AWS and Azure have multiple data centres around the globe, with redundancy levels you couldn’t realistically achieve in a private data centre. Scalability is a given for the queuing and storage services – you just use them as much as you need, and the service scales to cope. If you’re hosting the service provider nodes in the cloud too – either as a Windows Azure service, or within an Amazon EC2 instance – then they can be configured to automatically scale just as easily.
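To give a flavour of what “configured to automatically scale” looks like, here’s a minimal sketch using the Python boto3 SDK – the group name, launch configuration and sizes are all placeholder values, not part of the sample solution:

```python
# Illustrative only: auto-scale a hypothetical group of service provider nodes on EC2.
import boto3

autoscaling = boto3.client("autoscaling", region_name="eu-west-1")

# Keep between 2 and 10 provider nodes across two availability zones.
autoscaling.create_auto_scaling_group(
    AutoScalingGroupName="esb-service-providers",
    LaunchConfigurationName="esb-provider-launch-config",
    MinSize=2,
    MaxSize=10,
    AvailabilityZones=["eu-west-1a", "eu-west-1b"],
)

# Simple policy: add one instance whenever the associated alarm fires.
autoscaling.put_scaling_policy(
    AutoScalingGroupName="esb-service-providers",
    PolicyName="scale-out-on-demand",
    AdjustmentType="ChangeInCapacity",
    ScalingAdjustment=1,
)
```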

Cost-wise, there’s no comparison. For the AWS services used in the sample, the “free tier” means if you’re sending up to 100,000 messages a month on SQS with up to 1Gb total payload, and using less than 1Gb of storage on SimpleDB then there are no charges at all. The next level up incurs monthly costs (at time of writing) of around £0.20 for an additional 100,000 messages with 1Gb payload, and £0.30 per Gb of data in SimpleDB.

With the roughest calculation, you could push 10,000,000 messages through SQS, and use 100Gb storage on SimpleDB EVERY MONTH FOR 160 YEARS before approaching £100K in costs. Of course the BizTalk and cloud solutions are not comparable in terms of the functionality they offer, but for certain scenarios they can be compared as they may be used in similar ways.
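For anyone who wants to check that figure, the back-of-the-envelope calculation runs like this (prices as quoted above):

```python
# Rough check of the 160-year figure, using the per-unit prices quoted above (GBP).
price_per_100k_messages = 0.20   # SQS: 100,000 messages with ~1Gb payload
price_per_gb_simpledb = 0.30     # SimpleDB storage, per Gb per month

monthly_cost = (10_000_000 / 100_000) * price_per_100k_messages \
             + 100 * price_per_gb_simpledb          # = £20 + £30 = £50 a month

months_to_100k = 100_000 / monthly_cost             # = 2,000 months
print(f"~{months_to_100k / 12:.0f} years")          # ~167 years
```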

Security and Interoperability

Cloud services are based on interoperable standards – typically SOAP and REST – and are published on publicly-available endpoints. Azure has clever tricks to navigate firewalls for WCF bindings, and AWS uses standard HTTPS endpoints. To make an internal service publicly available you need to configure DMZs, domain names and firewalls, and sign up for ongoing vigilance against attack.

In AWS, transport-level security on the endpoints is complemented by message-level security, with every request containing a signature built using the sender’s secret key to verify that the content has not been tampered with. AWS also lets you secure or deny services to a specific list of AWS user accounts, so you can limit use of your queues and data stores to business partners. It’s fairly simple to interact with AWS services in an iPhone app.
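As a simplified illustration of that signing scheme – not the exact AWS algorithm, which canonicalises more of the request – the idea in Python looks like this:

```python
# Sketch of message-level signing: the request parameters are canonicalised and
# signed with the caller's secret key, so the service can verify the sender and
# detect tampering. Illustrative only; the real AWS signatures add more steps.
import base64
import hashlib
import hmac
import urllib.parse

def sign_request(params: dict, secret_key: str) -> str:
    canonical = "&".join(
        f"{urllib.parse.quote(k, safe='')}={urllib.parse.quote(str(v), safe='')}"
        for k, v in sorted(params.items())
    )
    digest = hmac.new(secret_key.encode(), canonical.encode(), hashlib.sha256).digest()
    return base64.b64encode(digest).decode()

params = {"Action": "SendMessage", "QueueName": "esb-requests", "MessageBody": "..."}
params["Signature"] = sign_request(params, secret_key="callers-secret-key")
```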

On-premise gives you more options for tailoring your solution. For example, you may have a secure, interoperable entry point to the ESB, but between ESB and service providers the communication is private so you could standardise on efficient WCF over TCP/IP, using IPSec to prevent any machines other than the ESB nodes accessing the service providers. Calling services with an iPad is likely to be more of a challenge.

Time to Market

Or Time to Release, depending on your scenario. Starting from zero, you can be up and running with a simple cloud service bus with less than a day’s effort. Assuming you’re exposing existing functionality then you could conceivably have a pilot project deployed for testing in the cloud within a week. Adding new services is as simple as adding a façade over existing code, to act as a handler when a known type of message request is received. Business partners can use your test environment with no special effort, and when you go live, decommissioning the test environment is just a case of deleting the queues and data stores with a few API calls.
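To show just how small “a few API calls” is, here’s a sketch of the decommissioning step using the Python boto3 SDK; the queue names are placeholders for a hypothetical test environment:

```python
# Tear down the test environment: delete its SQS queues.
import boto3

sqs = boto3.client("sqs", region_name="eu-west-1")

for queue_name in ["test-esb-requests", "test-esb-responses"]:
    queue_url = sqs.get_queue_url(QueueName=queue_name)["QueueUrl"]
    sqs.delete_queue(QueueUrl=queue_url)

# The SimpleDB test domains would go the same way, via its DeleteDomain call.
```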

On-premise, all the delivery times go up. Development, deployment and testing of a basic ESB solution – even if you use BizTalk and the ESB Toolkit as a starting point – is likely to be closer to a month than a week. The contract-first nature of BizTalk means adding a new service or changing an existing one requires deployment of ESB artifacts as well as the new service provider – this is not a negative, but it does add to the effort required.

Performance

In terms of latency, the cloud solution is never going to perform as well as on-premise. Pushing an ESB request and response through the BizTalk message box may add 0.5 seconds of latency, but calling out to the web is going to be double or quadruple that. That network latency in fetching a large query response could make the solution unworkable for end clients, compared to the LAN option.

The reverse may be true for globalised organisations which have their data centres in one region. While the on-premise solution needs to negotiate the WAN, the cloud solution has the option to push out to edge nodes, with independent queues and data stores residing in the nearest region to the consumer. In this scenario the cloud solution is likely to have the lower latency.

In terms of processor power and scaling up to meet demand, the cloud solution can provide that more-or-less instantly, with no service disruption and with a linear cost increase. Scaling an on-premise solution will always take longer to commission; scaling up means downtime for the upgraded nodes, and scaling out means cost spikes.

Limitations

In the cloud world, you currently have to live with limitations that will seriously affect your design. SQS allows a maximum message size of 8Kb, which is tiny, but should be enough for service requests and responses. You may have to give more consideration to serialization than you normally would, using JSON or Protocol Buffers rather than bloated XML.
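A simple guard along these lines – the message type and the limit constant are purely illustrative – keeps payloads compact and catches oversized requests before they reach the queue:

```python
# Serialize a request as compact JSON and enforce the SQS message size limit.
import json

MAX_SQS_BYTES = 8 * 1024   # the SQS limit discussed above

request = {
    "messageType": "GetCustomerBalance",   # hypothetical request contract
    "customerId": "C-10293",
    "asOf": "2011-03-01",
}

body = json.dumps(request, separators=(",", ":"))   # no extra whitespace
if len(body.encode("utf-8")) > MAX_SQS_BYTES:
    raise ValueError("Request payload exceeds the SQS message size limit")
```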

The BizTalk solution will happily deal with XML messages in the hundreds of megabytes, but to do so efficiently it needs to be tuned to favour large messages over throughput.

Governance

This is where the cloud option does really poorly. One of the key advantages of BizTalk and the ESB Toolkit is that I can navigate UDDI and get a list of all available services and their endpoints; then I can navigate the endpoint WSDL and see what the service contract is, or I can look up the XSD in BizTalk. The service and contract lookups can be wrapped into a custom tool, giving you a real-time ESB navigator which is fairly trivial to build.

In the sample cloud service bus there’s no separate catalogue of services or contracts. Services exist as long as there is a subscriber for a particular type of message; request and response contracts may be explicit in .NET classes, but across the bus they’re just strings of text with no option to validate them. For a robust cloud solution, governance needs to be in from the start, which is likely to mean some custom operations services which can be queried to get that service catalogue and contracts.
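The catalogue itself doesn’t need to be complicated to be useful. This sketch is purely illustrative – an in-memory registry that handlers populate when they subscribe – but the same shape could be persisted in SimpleDB and exposed as a queryable operations service:

```python
# Minimal service catalogue: each handler registers the message type it serves
# and its contracts, and consumers can query it like a lightweight UDDI.
from dataclasses import dataclass, field

@dataclass
class ServiceEntry:
    message_type: str      # request message type the handler subscribes to
    request_schema: str    # reference to the request contract (XSD / JSON Schema)
    response_schema: str   # reference to the response contract, if any
    owner: str             # team or partner responsible for the handler

@dataclass
class ServiceCatalogue:
    entries: dict = field(default_factory=dict)

    def register(self, entry: ServiceEntry) -> None:
        self.entries[entry.message_type] = entry

    def lookup(self, message_type: str) -> ServiceEntry:
        return self.entries[message_type]

    def list_services(self) -> list:
        return sorted(self.entries)

catalogue = ServiceCatalogue()
catalogue.register(ServiceEntry("GetCustomerBalance",
                                "customer-balance-request.xsd",
                                "customer-balance-response.xsd",
                                "Finance"))
print(catalogue.list_services())
```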