Continuing my series of posts on LINQ, I wanted to give a simple example of how one can get the same sort of query composition and lazy evaluation by using the yield keyword, without using any of C# 3’s features. Bear in mind that LINQ was introduced as part of .NET 3.5, which itself runs on the same CLR as .NET 2. So everything that happens with LINQ is “just” a set of compiler tricks, syntactic sugar and library methods: at runtime there’s nothing that happens that can’t be done manually with C# 2.
Here’s the task we’ll tackle: Get the next 5 dates that fall on a weekend.
In purely non-LINQ terms, we could easily carry out this operation as a while loop, bespoke for the problem at hand. However, this wouldn’t offer any of the benefits that an API like LINQ offers, e.g. composability and reusability of operations, which is what we’re trying to achieve. So let’s assume we’re trying to use a query-style mechanism; we also want to create something more like a Date query API that we could use to write other, similar queries in future.
In C# 3 using LINQ we might use Where() to filter out non-weekend days and then Take() to retrieve five items. But there’s an initial challenge that we encounter when trying to do this query with LINQ: what set of data do we actually query over? There’s no static “All Dates” property in .NET, and we don’t know in advance the set of dates to query over. This is where yield comes in. It allows us to easily create sequences of data that can be generated at run-time and queried over.
Take a look at this: –
Ignore the DisposableColour class – it just temporarily changes the colour of the console foreground. What’s more important is that this method returns something that masquerades as a collection of DateTimes, when in reality we’re generating an infinite stream of DateTimes, starting at the date argument provided. This collection has no end, so you can never ToList() it to fully execute it. Well, you could try, but it would keep going until DateTime.MaxValue is reached. It simply generates dates on demand starting from the provided date.
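As a rough illustration, here is a minimal sketch of what such a stream method might look like. It leaves out the DisposableColour console colouring, and the class name DateQueries, the parameter name and the message text are my own assumptions rather than the original listing.

```csharp
using System;
using System.Collections.Generic;

public static class DateQueries
{
    // Hypothetical sketch: yields startDate, then startDate + 1 day, and so on.
    // Nothing is generated until a caller asks for the next item.
    public static IEnumerable<DateTime> CreateDateStream(DateTime startDate)
    {
        DateTime current = startDate;
        while (true)
        {
            Console.WriteLine("CreateDateStream: yielding {0:d}", current);
            yield return current;

            // AddDays throws ArgumentOutOfRangeException once the result
            // would go past DateTime.MaxValue, so the stream is only
            // "infinite" for practical purposes.
            current = current.AddDays(1);
        }
    }
}
```

Calling CreateDateStream does not run any of the method body; the loop only advances each time a consumer calls MoveNext() on the enumerator, for example from a foreach.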
Implementing composable query methods
Given this stream, we can write two other methods: one that filters out dates that do not fall on a Saturday or Sunday, and another that will only “take” a given number of items from the sequence and then stop: –
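A minimal sketch of the filtering method might look like the following. The name FilterByDayOfWeek, the class name DateFilters and the params signature are assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;

public static class DateFilters
{
    // Hypothetical sketch: passes through only those dates whose day of the
    // week is one of the requested days.
    public static IEnumerable<DateTime> FilterByDayOfWeek(
        IEnumerable<DateTime> dates, params DayOfWeek[] days)
    {
        foreach (DateTime date in dates)
        {
            // Array.IndexOf returns -1 when the day is not in the list.
            if (Array.IndexOf(days, date.DayOfWeek) >= 0)
            {
                yield return date;
            }
            // Non-matching dates are skipped; the loop pulls the next one.
        }
    }
}
```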
Notice that the above method only yields dates that match one of the required days; for any other date it gives nothing back and simply moves on to the next item. Next, here’s a generic implementation of Take. It returns the next item in the collection and, once it has returned the required number of items, breaks out of the foreach loop.
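One possible shape for such a Take method is sketched below. The class name SequenceOperations and the guard for non-positive counts are my own assumptions.

```csharp
using System.Collections.Generic;

public static class SequenceOperations
{
    // Hypothetical sketch of a generic Take: yields at most 'count' items
    // and then stops pulling from the source.
    public static IEnumerable<T> Take<T>(IEnumerable<T> source, int count)
    {
        if (count <= 0)
        {
            yield break;
        }

        int taken = 0;
        foreach (T item in source)
        {
            yield return item;
            taken++;

            // Break straight after the last required item, so the source is
            // never asked for an item that would be thrown away.
            if (taken == count)
            {
                break;
            }
        }
    }
}
```

Checking the count after the yield rather than at the top of the loop matters for lazy sources: the source is not asked to produce one extra item just to discover that we are already finished.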
Consuming composable methods
Imagine that the methods above lived in a self-contained API that allowed us to easily query DateTimes – here’s how we could use it to answer our original question: –
All we do is generate the stream of DateTimes and then pipe them through the two other methods. The beauty of this is that because we’re yielding everything from the first method to the last, we only generate DateTimes that are required.
The key is the CreateDateStream method, i.e. the stream of dates. We cannot generate “every” date in advance, as that would be grossly inefficient; it’s much better to create the stream dynamically as required.
dateStream is a stream of all dates starting from DateTime.Today.
weekendDays is the filtered stream of dates from dateStream that fall on Saturday or Sunday.
nextUpcomingWeekendDays is the stream of the first 5 items from weekendDays.
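To make the three stages concrete, here is a self-contained sketch of the whole pipeline. The variable names dateStream, weekendDays and nextUpcomingWeekendDays follow the description above; everything else (the class name DateQueryDemo, the FilterByDayOfWeek name, the NextWeekendDays wrapper and the GeneratedCount counter) is assumed for illustration. The start date is passed in so that the result is repeatable; the post's scenario would pass DateTime.Today.

```csharp
using System;
using System.Collections.Generic;

public static class DateQueryDemo
{
    // Assumed helper: counts how many dates the stream actually produced.
    public static int GeneratedCount;

    static IEnumerable<DateTime> CreateDateStream(DateTime startDate)
    {
        DateTime current = startDate;
        while (true)
        {
            GeneratedCount++;
            yield return current;
            current = current.AddDays(1);
        }
    }

    static IEnumerable<DateTime> FilterByDayOfWeek(
        IEnumerable<DateTime> dates, params DayOfWeek[] days)
    {
        foreach (DateTime date in dates)
        {
            if (Array.IndexOf(days, date.DayOfWeek) >= 0)
            {
                yield return date;
            }
        }
    }

    static IEnumerable<T> Take<T>(IEnumerable<T> source, int count)
    {
        if (count <= 0) yield break;
        int taken = 0;
        foreach (T item in source)
        {
            yield return item;
            if (++taken == count) break;
        }
    }

    public static List<DateTime> NextWeekendDays(DateTime start)
    {
        GeneratedCount = 0;

        // None of these three lines generates a single date yet.
        IEnumerable<DateTime> dateStream = CreateDateStream(start);
        IEnumerable<DateTime> weekendDays =
            FilterByDayOfWeek(dateStream, DayOfWeek.Saturday, DayOfWeek.Sunday);
        IEnumerable<DateTime> nextUpcomingWeekendDays = Take(weekendDays, 5);

        // Enumerating the final stream is what pulls dates through the pipeline.
        List<DateTime> results = new List<DateTime>();
        foreach (DateTime date in nextUpcomingWeekendDays)
        {
            results.Add(date);
        }
        return results;
    }
}
```

For example, calling NextWeekendDays(new DateTime(2024, 1, 1)), a Monday, returns 6, 7, 13, 14 and 20 January 2024, and GeneratedCount ends at 20: the stream stops as soon as the fifth weekend day has been taken.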
If we run the code above, we get the following output: –
Look at the messages in more detail. We only created as many dates as were needed to match 5 weekend days. Only those dates that meet the filter criteria get streamed into the Take() method, and only those fall out into Main. When we’ve taken enough, Take() breaks out of its loop, which ends the foreach.
Yield is one of the key enablers for writing lazily-evaluated queries and collections. Without it, your queries would be less composable as well as less efficient; streaming out data as required allows us to only generate that part of the data that we still require.
In our example, we could just as easily get the next ten upcoming days without filtering, because Take is generic and so also operates on the unfiltered IEnumerable<DateTime>; we can simply chain up our methods as and where needed.
Lastly – we could easily change the signatures of the three API methods to make them extension methods (themselves a C# 3 feature) to give us a more LINQ-style DSL. It looks more like LINQ, but it’s still exactly the same code: –
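A sketch of how the extension-method versions might look is shown below. The method bodies are unchanged apart from the added this modifier. The names are assumptions; in particular I have called the last method TakeFirst rather than Take, purely so that the sketch does not become ambiguous with Enumerable.Take if System.Linq happens to be imported.

```csharp
using System;
using System.Collections.Generic;

public static class DateQueryExtensions
{
    // Hypothetical extension-method versions of the three methods.
    public static IEnumerable<DateTime> CreateDateStream(this DateTime startDate)
    {
        DateTime current = startDate;
        while (true)
        {
            yield return current;
            current = current.AddDays(1);
        }
    }

    public static IEnumerable<DateTime> FilterByDayOfWeek(
        this IEnumerable<DateTime> dates, params DayOfWeek[] days)
    {
        foreach (DateTime date in dates)
        {
            if (Array.IndexOf(days, date.DayOfWeek) >= 0)
            {
                yield return date;
            }
        }
    }

    public static IEnumerable<T> TakeFirst<T>(this IEnumerable<T> source, int count)
    {
        if (count <= 0) yield break;
        int taken = 0;
        foreach (T item in source)
        {
            yield return item;
            if (++taken == count) break;
        }
    }
}

public static class ExtensionUsage
{
    // The same query as before, written as a left-to-right chain.
    public static List<DateTime> NextWeekendDays(DateTime start)
    {
        IEnumerable<DateTime> query = start
            .CreateDateStream()
            .FilterByDayOfWeek(DayOfWeek.Saturday, DayOfWeek.Sunday)
            .TakeFirst(5);

        return new List<DateTime>(query);
    }
}
```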
If you’re struggling with this, it might help you to write out the code yourself and step through it with the debugger to see the actual flow of messages, or try creating some simple yield-style collections yourself.