[Source: http://geekswithblogs.net/EltonStoneman]
Recently I’ve been putting together a generic approach for paging the response from a WCF service. Paging changes the service signature, so it’s not as simple as adding a behavior to an existing service in config, but the complexity of the paging is isolated in a generic base class.
We’re using the Entity Framework talking to SQL Server, so when we ask for a page using LINQ’s.Take() method we get a nice efficient SQL query for just the rows we want, with minimal impact on SQL Server and network traffic. We use the maximum ID of the record returned as a high-water mark (rather than using .Skip() to go to the next record), so the approach caters for records being deleted between page requests.
In the paged response we include a HasMorePages indicator, computed by comparing the max ID in the page of results to the max ID for the whole resultset – if the latter is bigger, then there are more pages.
In some quick performance testing, the paged version of the service performed much more slowly than the unpaged version, which was unexpected. We narrowed it down to the code which gets the max ID for the full resultset – instead of building an efficient MAX() SQL query, EF was returning the whole resultset and then computing the max ID in the service layer.
It’s easy to reproduce – take this AdventureWorks query:
var context = new AdventureWorksEntities();
var query = from od in context.SalesOrderDetail
where od.ModifiedDate >= modified
&& od.SalesOrderDetailID.CompareTo(id) > 0
orderby od.SalesOrderDetailID
select od;
We can find the maximum SalesOrderDetailID like this:
var maxIdEfficiently = query.Max(od => od.SalesOrderDetailID);
which produces our efficient MAX() SQL query. If we’re doing this generically and we already have the ID function in a Func:
Func<SalesOrderDetail, int> idFunc = od => od.SalesOrderDetailID;
var maxIdInefficiently = query.Max(idFunc);
This fetches all the results from the query and then runs the Max() function in code. If you look at the difference in Reflector, the first call passes an Expression to the Max(), while the second call passes a Func. So it’s an easy fix – wrap the Func in an Expression:
Expression<Func<SalesOrderDetail, int>> idExpression = od => od.SalesOrderDetailID;
var maxIdEfficientlyAgain = query.Max(idExpression);
– and we’re back to running an efficient MAX() statement.
Evidently the EF provider can dissect an Expression and build its equivalent in SQL, but it can’t do that with Funcs.