I went through this exercise with a client last summer – it was a fairly long and
drawn-out process. The organization had always had a physically distinct middle tier
that was responsible for data access – based on the belief that both scalability and
security would be improved.
For the application in question at the time, DataSets were being returned from middle-tier
objects – marshaled via .NET Remoting (binary/tcp). Now, DataSets don’t have
a very compact representation when serialized, as has been described here.
Whether due to the serialization format or other aspects of the serialization process,
our performance tests indicated that the time spent serializing/deserializing DataSets
imposed a tremendous CPU tax on both the web server and the application server – even
after implementing techniques to
address it. The throughput (requests per second) on the web servers and the request
latency both suffered dramatically.
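To make the cost concrete: by default a DataSet serializes itself as XML text even when you use the BinaryFormatter, which is what made the Remoting payloads so bloated. The sketch below (illustrative table and sizes, not our actual code or data) shows the kind of comparison we ran; the RemotingFormat property shown here is the fix that later shipped in .NET 2.0, whereas the earlier workarounds involved custom serialization surrogates.

```
// Sketch only: why default DataSet serialization is so verbose on the wire.
// The table shape and row count are illustrative, not from our tests.
using System;
using System.Data;
using System.IO;
using System.Runtime.Serialization.Formatters.Binary;

class DataSetSizeDemo
{
    static long SerializedSize(DataSet ds)
    {
        using (MemoryStream ms = new MemoryStream())
        {
            new BinaryFormatter().Serialize(ms, ds);
            return ms.Length;
        }
    }

    static void Main()
    {
        DataSet ds = new DataSet("Orders");
        DataTable t = ds.Tables.Add("Order");
        t.Columns.Add("Id", typeof(int));
        t.Columns.Add("Customer", typeof(string));
        for (int i = 0; i < 10000; i++)
            t.Rows.Add(i, "Customer " + i);

        // Default: the DataSet embeds itself as XML text inside the binary stream.
        ds.RemotingFormat = SerializationFormat.Xml;
        long xmlSize = SerializedSize(ds);

        // .NET 2.0 and later: request a true binary wire format.
        ds.RemotingFormat = SerializationFormat.Binary;
        long binSize = SerializedSize(ds);

        Console.WriteLine("xml-in-binary: {0} bytes, binary: {1} bytes",
            xmlSize, binSize);
    }
}
```

Even with a more compact wire format, of course, you still pay the CPU cost of serializing and deserializing on both ends – which is the tax the numbers below reflect.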
We conducted extensive tests using ACT (Application Center Test) to drive 40 virtual
users with no dwell time (i.e. each driver thread issued another request as soon
as the previous one returned). Two web servers were used, and in the remoted middle-tier
case, we had a single middle-tier server. A single Sql Server was used.
All servers were “slightly aged” four-processor machines. The read operations
brought back large DataSets, whereas the write operations were fairly simple by comparison
– the workload was intended to simulate real user profiles.
[Charts: Requests Per Second (RPS) and request latency – remoted middle tier vs. local middle tier]
Notice that not only was the local middle tier (non-remoted case) able to sustain
a much higher throughput, but it had far less latency as well. CPU utilization
indicated we would need one physical middle tier server for every web server.
(Of course, when comparing raw performance of “physical middle tier vs. not”, you
always need to ask “what would happen if I deployed these middle tier servers as front-end
web servers instead?” In practice, you don’t even need to go that far – just
getting rid of the middle tier servers altogether will often improve performance…)
So, after evaluating performance, we decided we wanted to push for
a local middle tier, and allow (gasp) access to the database from the DMZ. This
led us to a long and serious discussion of the security implications, and our reasoning
followed Ian’s quite closely. The Threats
and Countermeasures text was a very valuable resource. We certainly avoided
all dynamic Sql (in favor of stored procedures), used a low-privilege
(Windows) account to access Sql Server (one that only had rights to execute the stored
procedures), used strongly-typed (SqlParameter) parameters – which are type/length
checked – for all database calls, used the .NET config encryption mechanism to avoid
storing connection strings in the clear, used non-standard ports for Sql Server, etc.
The quantity of advice to digest is large indeed – but it is necessary regardless of
whether you are deployed with a physical middle tier or not…
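As an illustration of the data-access pattern described above, here is a sketch of a stored-procedure-only call with a strongly-typed parameter. The procedure and parameter names (dbo.GetOrdersForCustomer, @CustomerId) are hypothetical, not from our actual code:

```
// Sketch: stored-procedure-only data access with type-checked parameters.
// Run this under a low-privilege Windows account that has EXECUTE rights
// on the procedures and nothing else.
using System.Data;
using System.Data.SqlClient;

class OrderData
{
    public static DataSet GetOrdersForCustomer(string connectionString,
                                               int customerId)
    {
        using (SqlConnection conn = new SqlConnection(connectionString))
        using (SqlCommand cmd = new SqlCommand("dbo.GetOrdersForCustomer", conn))
        {
            // No dynamic Sql: the command is a stored procedure name,
            // never concatenated SQL text.
            cmd.CommandType = CommandType.StoredProcedure;

            // Strongly-typed parameter: the provider enforces the type
            // (and length, for string types) before anything reaches the server.
            cmd.Parameters.Add("@CustomerId", SqlDbType.Int).Value = customerId;

            DataSet ds = new DataSet();
            new SqlDataAdapter(cmd).Fill(ds);  // Fill opens/closes the connection
            return ds;
        }
    }
}
```

The connection string itself would come from configuration encrypted with the .NET protected-configuration mechanism, not from a plaintext file.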
Two closing thoughts on this topic…. First, Martin Fowler sums it up well in his
book Patterns of Enterprise Application Architecture (Chapter 7, “Errant
Architectures”) – excerpted in an SD
magazine article. After introducing the topic, he says:
“…Hence, we get to my First Law of Distributed Object Design: Don’t
distribute your objects! How, then, do you effectively use multiple processors [servers]?
In most cases, the way to go is clustering [of more front-end web servers]. Put all
the classes into a single process and then run multiple copies of that process on
the various nodes. That way, each process uses local calls to get the job done and
thus does things faster. You can also use fine-grained interfaces for all the classes
within the process and thus get better maintainability with a simpler programming
model. …All things being equal, it’s best to run the Web and application
servers in a single process—but all things aren’t always equal.”
Second, does anyone remember the nile.com benchmarks that
DocuLabs conducted? I can’t find the exact iteration of the benchmark
I’m looking for, but they found that ISAPI components calling local COM+ components
on a single Compaq 8500 (8-way) could achieve 3000 requests per second, vs. just 500
once the COM+ components were placed on a separate Compaq 8500. Unreal.
(And by the way, with those numbers, what the heck was wrong with the ASP.NET code above?
Oh well, nile.com WAS a benchmark after all…)