Entity Framework use Expression instead of Func Delegate

When we use entity framework to manipulate SQL database, we use where to query and include to load related entities (join operation).

Sample database table schema: Employee: Id, Name, Age Salary: Id, BasePay

Typical query scenarios:

  • Query employee 3's name => Employee.Where(x => x.Id == 3)
  • Query employee Jack's age => Employee.Where(x => x.Name == "Jack")
  • Query employee Jack's basepay => Employee.Where(x => x.Name == "Jack").Include(x => x.Salary) ...

To make the code clean and focus, the following examples will not include dbContext creation logic, we assume db = DbContext() which contains the Employee table.

For code reuse purpose, we may extract a common method like this:

Employee GetEntity(Func<Employee, bool> whereFunc, Func<DbSet<Employee>, IQueryable<Employee>> navigationFunc = null)
{
    IQueryable<Employee> query = navigationFunc != null ? navigationFunc(db.Employee) : db.Employee;
    var employee = await query.Where(x => whereFunc(x)).ToListAsync();
    return employee;
}

The implmentation looks simple and straightfoward: just extract the where and navigation Func<> to a function.

However, the above function has severe perfomance issue: when we execute the function, it retrieves all data from the table and locally applies where filter. Actually, the underlying mechanism is not that simple as entity framework will compile the expression and translated to SQL statements, though it looks like just a syntactic sugar.

Problem Analysis

The reason is well explained in the following post: # Don’t use Func delegate on the Entity Framework entities

The post said that .Where Linq method has two overloads. One is IQueryable while another is IEnumerable. When we use the above function, IEnumerable will be used so that all data will be retrieved to local.

I agree with the conclusion. However, when I use F12 to go to its definition, they are the same:

public static IQueryable<TSource> Where<TSource>(this IQueryable<TSource> source, Expression<Func<TSource, bool>> predicate);

Since IQueryable implements IEnumerable, it falls back to IEnumerable. Anyway, we need to pass Expression<Func> to the common function.

Take Employee.Where(x => x.Name == "Jack").Select(x => x.Name) for example. If you print out the translated SQL, you will find that the where expression is missing:

SELECT `x`.`Name`
FROM `XXX`.`Employee`
And that's why we have severe performance issue. Next we discuss how to write correct Expresssion<Func> and its usage.

Where Expression

The original Func<> function:

Employee GetEntity(Func<Employee, bool> whereFunc)
{
    IQueryable<Employee> query = db.Employee.Where(x => whereFunc(x));
    var employee = await query.ToListAsync();
    return employee;
}

The Expression<Func> way (just wrap the Func into Expression, pay attention to Where parameter):

Employee GetEntity(Expression<Func<Employee, bool>> whereExpr)
{
    IQueryable<Employee> query = db.Employee.Where(whereExpr);
    var employee = await query.ToListAsync();
    return employee;
}

From the caller's view, the usage are the same (the underlying mechanism are totally different):

var employee = GetEntity(x => x.Name == "Jack");

If you print out the translated SQL, the where clause comes back:

SELECT `x`.`Name`
FROM `XXX`.`Employee`
WHERE `x`.`Name` == 'Jack'

Navigation Expression

I spent some time to investigate how to transform Func<DbSet<Employee>, IQueryable<Employee>> to Expression. The hardest part is the return vaule is not fixed: you may .include(x => x.Salary), .include(x => x.Assets), both or nothing. Finally I found a way to solve it: use object as return value.

The original version:

Employee GetEntity(Func<DbSet<Employee>, IQueryable<Employee>> navigationFunc)
{
    IQueryable<Employee> query = navigationFunc(db.Employee).Where(x => x.Name == "Jack");
    var employee = await query.ToListAsync();
    return employee;
}

The Expression<Func> way:

Employee GetEntity(params Expression<Func<Employee>, object>>[] navigationExprs)
{
    IQueryable<Employee> query = db.Employee;
    foreach (var navigationExpr in navigationExprs)
    {
        query = query.Include(navigationExpr);
    }
    query = query.Where(x => x.Name == "Jack");
    var employee = await query.ToListAsync();
    return employee;
}
The usage are different:

  • Original version

    var employee = GetEntity(x => x.Include(y => y.Salary);

  • Expression version

    var employee = GetEntity(x => x.Salary);

Reference: # Passing multiple Include statements into a Repository

Put It All Together

Employee GetEntity(Expression<Func<Employee, bool>> whereExpr, params Expression<Func<Employee>, object>>[] navigationExprs)
{
    IQueryable<Employee> query = db.Employee;
    foreach (var navigationExpr in navigationExprs)
    {
        query = query.Include(navigationExpr);
    }
    
    query = query.Where(whereExpr);
    
    var employee = await query.ToListAsync();
    return employee;
}

Issues

It seems that the Expression as a parameter cannot be wrapped. If you have another function:

Employee GetEntityFunc(Func<Employee, bool> whereFunc)
{
    Expression<Func<Employee, bool>> whereExpr = x => whereFunc(x);
    return GetEntity(whereExpr);
}

var employee = GetEntityFunc(x => x.Name == "Jack");

Although it's syntactically correct, it doesn't work in entity framework as the where clause is applied locally.