The cost of creating Expression Trees


First, merry Xmas etc. etc.

Second: expression trees. I just wanted to post a short blog about the actual cost of creating trees, which can be illustrated by comparing the following two code samples:

image

image

See the difference? Rather than creating the expression inside the for loop (Sample A), we create it once outside the for loop and reuse it for all iterations (Sample B). (FYI: ParseExpression() does absolutely nothing; it’s just there as a means to an end.) Here are the timing results for 1,000,000 loops:
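Since the screenshots don't survive here, this is a minimal sketch of what the two samples might look like, not the original code. The lambda `s => s.Length`, the names `SampleA` and `SampleB`, and the `Time` helper are all assumptions of mine; the only detail taken from the post is that `ParseExpression()` does nothing.

```csharp
using System;
using System.Diagnostics;
using System.Linq.Expressions;

public static class ExpressionCostSketch
{
    // Does nothing on purpose; it only gives the expression somewhere to go.
    public static void ParseExpression(Expression<Func<string, int>> expression) { }

    // Sample A (assumed shape): a new expression tree is built on every iteration.
    public static void SampleA(int iterations)
    {
        for (var i = 0; i < iterations; i++)
        {
            ParseExpression(s => s.Length);
        }
    }

    // Sample B (assumed shape): the tree is built once and reused.
    public static void SampleB(int iterations)
    {
        Expression<Func<string, int>> expression = s => s.Length;
        for (var i = 0; i < iterations; i++)
        {
            ParseExpression(expression);
        }
    }

    // Hypothetical timing helper: runs the action once and returns the elapsed time.
    public static TimeSpan Time(Action action)
    {
        var stopwatch = Stopwatch.StartNew();
        action();
        stopwatch.Stop();
        return stopwatch.Elapsed;
    }
}
```

Something like `ExpressionCostSketch.Time(() => ExpressionCostSketch.SampleA(1000000))` could then be compared against the same call for `SampleB`. The difference comes from the fact that a lambda converted to `Expression<...>` is compiled into calls to the `Expression` factory methods, and in Sample A those calls run on every pass through the loop.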

image

Running the performance tuning wizard against this shows us some cheeky reflection calls behind the scenes, which even bring up a performance warning:

image

image

The lesson to be learned? Expression Trees are an extremely useful tool in your arsenal, and can give big performance gains over e.g. reflection or even dynamic. But make sure that you cache the expressions themselves, otherwise the cost of creating the tree can outweigh the benefits gained from it.

Performance or readability with Expression Trees in C#


I was fumbling around trying to create some expression trees on Friday, working through some stuff with a colleague of mine (someone who actually knows how to do it better than me!), and I got to thinking about how different ways of doing the same thing in C# compare on both runtime performance and readability / ease of development.

Let’s say we have the following type hierarchy:

image

Easy enough. Now we want, for some arbitrary reason, to write a generic method that will set a specific integer property to the value of 25, but we don’t know in advance which one it is. In our example above, that could be either the Age of the employee or the AddressId. You could do this with reflection:

image

image
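As the screenshots are missing here, the following is a rough sketch of what the reflection version could look like. The flat `Employee` class is a hypothetical stand-in for the type hierarchy (only the Age and AddressId names come from the post), and `SetTo25` is a name I've made up.

```csharp
using System;
using System.Reflection;

// Hypothetical stand-in for the type shown in the post.
public class Employee
{
    public int Age { get; set; }
    public int AddressId { get; set; }
}

public static class ReflectionSetterSketch
{
    // Looks up a public int property by name and sets it to 25 on the instance.
    public static void SetTo25<T>(T instance, string propertyName) where T : class
    {
        PropertyInfo property = typeof(T).GetProperty(propertyName);
        if (property == null || property.PropertyType != typeof(int))
        {
            throw new ArgumentException(
                "No public int property '" + propertyName + "' on " + typeof(T).Name,
                "propertyName");
        }

        // The third argument is the indexer argument list; null for ordinary properties.
        property.SetValue(instance, 25, null);
    }
}
```

Calling `ReflectionSetterSketch.SetTo25(employee, "Age")` sets Age to 25, but the property name is just a string, so a typo only shows up at runtime.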

Expression Trees

Nice. This works well, no problems whatsoever. There is another, strongly typed way of doing it though, with expression trees. By passing in an expression tree that navigates to that property, we can, at runtime, construct an Action delegate specific to what we want, and then call that delegate as needed. Here’s how we would consume such a method:

image

ageSetter and addressIdSetter are two delegates that we construct at runtime to perform the task of setting the appropriate property to the value of 25. The main benefit of such an approach is obvious, i.e. strong typing. But there’s another, more subtle benefit: performance. To understand why, we need to understand how this method is implemented.

CreatePropertySetter returns an Action<T> which, when called, will set the appropriate property to 25 (I’ve elided the GetMembersEnumerator method for clarity):

image

To make it a bit easier to see what happens when we call the method with e => e.Age, here’s what the above method looks like with debugger quickwatches pinned:

image

By compiling the final expression, we get a proper Action delegate which is the same as instance => instance.Age = 25, except that we’ve constructed this code at runtime.
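To make that concrete without the screenshots, here is a minimal sketch of one way such a method could be written. It is not the original implementation: it does not reconstruct the GetMembersEnumerator helper, it assumes the selector is a plain property access typed as `Func<T, int>`, and the `Employee` class is again a hypothetical stand-in.

```csharp
using System;
using System.Linq.Expressions;

// Hypothetical stand-in for the type shown in the post.
public class Employee
{
    public int Age { get; set; }
    public int AddressId { get; set; }
}

public static class PropertySetterSketch
{
    public static Action<T> CreatePropertySetter<T>(Expression<Func<T, int>> selector)
    {
        // For e => e.Age the lambda body is a MemberExpression pointing at Age.
        var member = selector.Body as MemberExpression;
        if (member == null)
        {
            throw new ArgumentException(
                "Expected a simple property access such as e => e.Age.", "selector");
        }

        // Build the assignment "e.Age = 25", reusing the selector's own member access.
        BinaryExpression assign = Expression.Assign(member, Expression.Constant(25));

        // Wrap it as "e => e.Age = 25" over the selector's parameter and compile it.
        return Expression.Lambda<Action<T>>(assign, selector.Parameters).Compile();
    }
}
```

Consuming it would then look something like `var ageSetter = PropertySetterSketch.CreatePropertySetter<Employee>(e => e.Age);` followed by `ageSetter(employee);`. Because the selector is a real lambda rather than a string, a misspelt property name fails at compile time.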

I find Expression Trees difficult to get my head around. I understand the idea of them: essentially building up a tree of things like “assign property”, “call method”, “if” etc. which can then be compiled into a lambda. But the Expression API has lots of factory methods and many ways to do the same thing; I guess with time you gain familiarity with the API, but it’s definitely not the easiest thing in the world to learn. Worse still, look at the code required to do the same thing as that reflection code: much more effort and much less readable.

However, where performance is concerned, Expression Trees will beat Reflection by miles, as long as you cache the Action delegate that is generated! The construction and parsing of the expression trees required to create the Action is an expensive operation, so you should cache the Actions. Once you’ve done that, the cost is extremely low, as you are simply calling an Action delegate, almost the same as if you had written it yourself. For example, to set both the Age and AddressId properties to 25, for 2.5 million Employees, I observed the following timings:
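One possible way to do that caching is sketched below. Everything here is an assumption rather than the code behind the timings: `CachedSetters<T>` is a made-up name, the cache is keyed by property name (so it only makes sense for properties accessed directly on T), and `Employee` is a hypothetical stand-in.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq.Expressions;

// Hypothetical stand-in for the type shown in the post.
public class Employee
{
    public int Age { get; set; }
    public int AddressId { get; set; }
}

public static class CachedSetters<T>
{
    // One compiled setter per property name, per T.
    private static readonly ConcurrentDictionary<string, Action<T>> Cache =
        new ConcurrentDictionary<string, Action<T>>();

    public static Action<T> Get(Expression<Func<T, int>> selector)
    {
        var member = selector.Body as MemberExpression;
        if (member == null || !(member.Expression is ParameterExpression))
        {
            throw new ArgumentException(
                "Expected a direct property access such as e => e.Age.", "selector");
        }

        // Only compile when this property has not been seen before.
        return Cache.GetOrAdd(member.Member.Name, _ => Compile(member, selector));
    }

    private static Action<T> Compile(MemberExpression member, Expression<Func<T, int>> selector)
    {
        // Builds and compiles "e => e.Property = 25".
        var assign = Expression.Assign(member, Expression.Constant(25));
        return Expression.Lambda<Action<T>>(assign, selector.Parameters).Compile();
    }
}
```

Note that calling `CachedSetters<Employee>.Get(e => e.Age)` still builds the small selector tree on every call, so in a hot loop it is cheaper again to fetch the delegate once, hold it in a variable, and call that.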

Type of code applied    Time required (ms)
Direct assignment       144
Reflection              3,881
Expression Trees        349
Dynamic                 597
Hard-coded lambdas      136

Obviously the other three options (direct assignment, dynamic and hard-coded lambdas) wouldn’t be appropriate for the problem at hand (generic property assignment), but I wanted to illustrate how reflection and expression trees perform next to them. By the way, notice how much quicker dynamic is than reflection (and these timings were obtained when caching the property info objects as well). Funnily enough, if I wrote hard-coded lambdas, e.g. instance => instance.Age = 25, and used them, they outperformed code like instance.Age = 25. Why?
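For anyone wanting to try a similar comparison, here is a rough sketch of how the five variants might be set up. It is not the harness behind the numbers above: the class and method names are made up, it only sets Age rather than both properties, and `Employee` is a hypothetical stand-in.

```csharp
using System;
using System.Diagnostics;
using System.Linq.Expressions;
using System.Reflection;

// Hypothetical stand-in for the type shown in the post.
public class Employee
{
    public int Age { get; set; }
    public int AddressId { get; set; }
}

public static class SetterBenchmarkSketch
{
    // Times one setter delegate over every employee.
    public static long TimeMs(Employee[] employees, Action<Employee> setAge)
    {
        var stopwatch = Stopwatch.StartNew();
        foreach (var employee in employees) setAge(employee);
        stopwatch.Stop();
        return stopwatch.ElapsedMilliseconds;
    }

    // Direct assignment: no delegate involved at all.
    public static long DirectAssignmentMs(Employee[] employees)
    {
        var stopwatch = Stopwatch.StartNew();
        foreach (var employee in employees) employee.Age = 25;
        stopwatch.Stop();
        return stopwatch.ElapsedMilliseconds;
    }

    // Reflection, with the PropertyInfo looked up once and reused.
    public static Action<Employee> Reflection()
    {
        PropertyInfo age = typeof(Employee).GetProperty("Age");
        return e => age.SetValue(e, 25, null);
    }

    // Expression tree, compiled once into "instance => instance.Age = 25".
    public static Action<Employee> ExpressionTree()
    {
        ParameterExpression instance = Expression.Parameter(typeof(Employee), "instance");
        var assign = Expression.Assign(Expression.Property(instance, "Age"), Expression.Constant(25));
        return Expression.Lambda<Action<Employee>>(assign, instance).Compile();
    }

    // dynamic: the member lookup is resolved at runtime.
    public static Action<Employee> Dynamic()
    {
        return e => { dynamic d = e; d.Age = 25; };
    }

    // Hard-coded lambda written directly in C#.
    public static Action<Employee> HardCodedLambda()
    {
        return e => e.Age = 25;
    }
}
```

Each delegate would be created once up front, e.g. `var setter = SetterBenchmarkSketch.ExpressionTree();`, and then passed to `TimeMs` with the same array of employees. Timings from a sketch like this depend heavily on the runtime, build configuration and warm-up, so they shouldn't be expected to match the table.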

Conclusion

When you next use reflection for some property assignments or method calls, think about using Expression Trees, particularly where performance is a factor. Expression Trees are time-consuming to write and understand, but offer superior performance (provided you cache the compiled delegates) and can be consumed in a strongly-typed manner. Alternatively, consider the use of dynamic where possible: for property setters, getters and method calls it also performs better than reflection, and whilst obviously not strongly-typed, the code is far more readable.