One of the main reasons for using LINQ is that it provides a consistent way to query all sorts of data. You can write LINQ code with hardly a thought for whether you're querying objects, XML or a database. However, there can be subtle differences. As we near the end of this series on LINQ, we're going to explore one of those differences.
In Part 2, we saw that the OrderByDescending extension method on IEnumerable<> takes a Func as its parameter. However, in the previous post we saw that the OrderByDescending extension method of IQueryable<> takes an Expression. Here they are for you to compare.
public static IOrderedEnumerable<TSource> OrderByDescending<TSource, TKey>(
this IEnumerable<TSource> source,
Func<TSource, TKey> keySelector);
public static IOrderedQueryable<TSource> OrderByDescending<TSource, TKey>(
this IQueryable<TSource> source,
Expression<Func<TSource, TKey>> keySelector);
Why the difference? Implementers of IQueryable<> typically are not working with C# objects. The underlying data is more likely to be a SQL database or XML file. Therefore, they have no use for C# code, which is what a Func is. They would rather have a description of code. That's the need that Expressions fill. Usually, the Expressions you create will be Expression Trees. You'll see shortly why they're called that.
Building Expressions
In the Demo7 project of the download that goes with this series, you'll see this code in Program.cs. It's a method that will select Cities according to an arbitrary WHERE clause that is passed in as a parameter.
static IEnumerable<City> GetWhere(
Expression<Func<City, bool>> where,
int numDesired)
{
Func<City,bool> funcWhere = where.Compile();
return Cities
.Where(funcWhere)
.OrderByDescending(c => c.Population)
.Take(numDesired);
}
An aside: The first thing we do with the parameter is Compile() it into a Func<City,bool>. This is only necessary because this code is designed to work with both the IQueryable and IEnumerable flavors of Cities that were in the demo. If you were using an IQueryable-based implementation of LINQ, such as LINQ to Entities, you would want to pass the expression directly to the Where method without compiling it. By the end of this post, you'll understand why!
So how do we come up with an Expression for our 'where' parameter? The easiest way is to assign it from an Expression Lambda. Here's one that select all Cities whose names start with the letter 'P'.
Expression<Func<City, bool>> where = c => c.Name.StartsWith("P");
Now for the hard way. We will build the same Expression (actually an Expression Tree) one step at a time.
// The parameter, c
var parm = Expression.Parameter(typeof(City), "c");
// c.Name
var propName = Expression.Property(parm, "Name");
// The constant, "P"
var constP = Expression.Constant("P");
// c.Name.StartsWith("P")
var nameStartsWith = Expression.Call(
propName,
typeof(string).GetMethod("StartsWith", new[] { typeof(string) }),
constP );
var exprName = Expression.Lambda<Func<City, bool>>(nameStartsWith, parm);
You can see that this is painstaking. We had to follow these steps.
- Remembering that we're creating an Expression<Func<City,bool>>, we create an Expression to represent the City parameter that will be passed to the Func. By the way, the parameter name, "c", is for documentation only. You can see it when you're in the debugger but it does not affect the way the code runs.
- We create an Expression that represents the Name property of that City parameter.
- We even need an Expression for the constant, "P"!
- We create an Expression to represent a call to the StartsWith method of the Name property.
- Finally, we package the whole thing as Lambda Expression. That is what we could pass to our method as the 'where' parameter.
If that's not challenging enough, the sample code continues by building an Expression that corresponds to
c.Name.StartsWith("P") || c.Population > 1000000
Why Build Expressions?
Why go to all that trouble? In most cases, you won't have to. If a LINQ implementation uses Expression parameters, as with IQueryable<>, you'll pass a Lambda and the compiler will create an Expression for you. You might not even be aware that you're really using an Expression.
In other cases, you'll need to build your query piece by piece -- maybe in response to a user's dynamic requests. Then, you'll need the techniques above.
Unintended Consequences
Sometimes there are unintended, hidden consequences of using Funcs instead of Expressions, or putting your LINQ calls in one order rather than another.
Consider these three scenarios, taken from the Demo6 project of the sample code. All of them query the Cities collection, which is a LINQ to Entities (database) object. The query has an arbitrary 'where' clause, passed in as a parameter, orders the selected cities descending by population, and selects the first 'numDesired'.
In each case, I've shown the method and the SQL that LINQ to Entities generates.
static IEnumerable<City> GetWhere(
LinqDemoEntities context,
Expression<Func<City, bool>> where,
int numDesired)
{
return context.Cities
.Where(where)
.OrderByDescending(c => c.Population)
.Take(numDesired);
// SELECT TOP (15)
// [Extent1].[Name] AS [Name],
// [Extent1].[StateCode] AS [StateCode],
// [Extent1].[Population] AS [Population]
// FROM [dbo].[Cities] AS [Extent1]
// WHERE [Extent1].[Name] LIKE N'P%'
// ORDER BY [Extent1].[Population] DESC
}
That is the best solution. Unlike the others you'll see, all of the LINQ ends up in the SQL statement, meaning that the database will not return anything we don't want.
Contrast that with what happens if we change the Expression<Func<,>> parameter to a simple Func<,>.
static IEnumerable<City> GetWhere(
LinqDemoEntities context,
Func<City, bool> where,
int numDesired)
{
return context.Cities
.Where(where)
.OrderByDescending(c => c.Population)
.Take(numDesired);
// SELECT
// [Extent1].[Name] AS [Name],
// [Extent1].[StateCode] AS [StateCode],
// [Extent1].[Population] AS [Population]
// FROM [dbo].[Cities] AS [Extent1]
}
Notice the generated SQL. It's returning the entire table! What happened to the WHERE, ORDER BY and TOP clauses??
What happened is this.
- The compiler sees that there is no Where extension method on IQueryable<> that takes a Func<,> as a parameter. Rather, IQueryable's Where method expects an Expression<Func<,>>.
- However, IQueryable<> inherits from IEnumerable<> and IEnumerable<> does have a Where extension method that takes a Func<,>. The compiler therefore uses IEnumerable<>'s extension method.
- So the Cities were filtered as an IEnumerable<>, which is to say that LINQ to Entities didn't get a chance to do anything more than serve up the entire table.
Quiz: What do you think would happen if the OrderByDescending were to precede the Where?
static IEnumerable<City> GetWhere(
LinqDemoEntities context,
Func<City, bool> where,
int numDesired)
{
return context.Cities
.OrderByDescending(c => c.Population)
.Where(where)
.Take(numDesired);
}
How many of these steps did you predict?
- Cities flows into OrderByDescending as an IQueryable<>.
- The compiler sees that IQueryable<> has an extension method that takes an expression.
- The compiler creates an Expression for you from the Lambda, c => c.Population.
- OrderByDesending is thus processable by the IQueryable<> LINQ provider, namely LINQ to Entities. So far, so good.
- The 'where' variable is a Func, not an Expression. As we saw in the second scenario, the compiler sees that there is no Where(Func<,>) extension method on IQueryable<>, so it processes the Where in IEnumerable<> land. That means it is not passed to LINQ to Entities and does not become part of the SQL.
- The resulting SQL is as follows. ORDER BY appears, but there is no WHERE.
// SELECT
// [Extent1].[Name] AS [Name],
// [Extent1].[StateCode] AS [StateCode],
// [Extent1].[Population] AS [Population]
// FROM [dbo].[Cities] AS [Extent1]
// ORDER BY [Extent1].[Population] DESC
I hope you have enjoyed this series on LINQ. If you have any questions, please leave a comment!