
Modelling problem domains in C# and F# – Part 2


In my last post, I illustrated how we could model a simple real-world problem using classic OO concepts such as type hierarchies, interfaces and stateful objects. In this post I want to contrast that with a functional-first approach using F#.

Discriminated Unions

In order to model the different types of Positions on a Monopoly board in C#, we used a type hierarchy with an abstract base class and derived specialisations. However, whilst F# has those same constructs, it also has the lightweight Discriminated Union, which for our problem is a much better fit: -

This gives us, in C# terms, a full class hierarchy with Position as the “abstract base class”, and the others “inheriting” from it. Each of the above lines equates to a full class in C# terms. Read that again. Each of those lines maps to a full C# class, with constructors and properties etc. The first two have a constructor that takes in a string which then gets copied into a read-only field and exposed as a get-only property. The types are immutable and have full value-based equality checking. Compile the above in F#, and then do a “go to definition” from a C# project to see just how much boilerplate this is doing for you.
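To give a feel for that boilerplate, here is a hand-written C# approximation of what a single union case such as Property might amount to. It is only a sketch: the member names are assumptions chosen for illustration, and the code the F# compiler really generates differs in detail.

```csharp
using System;

// Plays the role of the "abstract base class" of the union.
public abstract class Position { }

// Rough hand-written equivalent of one union case carrying a string.
public sealed class Property : Position, IEquatable<Property>
{
    private readonly string name; // set once in the constructor, never changed

    public Property(string name) { this.name = name; }

    public string Name { get { return name; } } // get-only property

    // Value-based equality: two instances with the same name are equal.
    public bool Equals(Property other)
    {
        return other != null && name == other.name;
    }

    public override bool Equals(object obj) { return Equals(obj as Property); }

    public override int GetHashCode() { return name == null ? 0 : name.GetHashCode(); }
}

public static class Program
{
    public static void Main()
    {
        Position a = new Property("Old Kent Road");
        Position b = new Property("Old Kent Road");
        Console.WriteLine(a.Equals(b));           // True: compared by value
        Console.WriteLine(ReferenceEquals(a, b)); // False: two separate instances
    }
}
```

That is roughly thirty lines of C# for what the text above describes as a single line of F#, and it would need repeating for every case that carries data.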

Notice the following as well: -

  • We just store data for the union types that we need (Property and Station) e.g. “Old Kent Road” or “Kings Cross”. The others don’t have any unique data in them so we don’t store any.
  • There’s also no “base” commonality between the types – we can shape them however we want in an unrestricted fashion.
  • There’s no behaviour on them – just some data.

Applying behaviour to type hierarchies in F#

So given that there are no methods on these types, how do we do stuff like getting the Name of the position that we’ve landed on in a “polymorphic”-esque manner? The answer is that we use the match keyword to write a single function that handles all of the union cases.

This is a key difference from how we modelled this in the OO world. Previously, this behaviour existed across all types in the domain. Now we have a single function which represents the behaviour of printing the name for all types of Position. Note how for Property and Tax we “declare” an appropriate variable inline to get at the Property Name or Tax Amount. Match is also smart enough to give compiler warnings if you miss out a type of discriminated union, so as we add new types the compiler will alert us for missing cases.

Optional behaviours on type hierarchies

Now let’s implement the equivalent for the IMovementPosition that GoToJail and Chance etc. would implement. We can do this with a function that takes in a position and returns an optional Position: -

Here we’re using F#’s Option<T> type. The calculateMove function may or may not return a new position; it’ll only do so if we landed on e.g. Go To Jail, or a Chance card that involved a movement e.g. Advance to Go. So we can return “Some” Position or None; indeed notice also the “_” match; for positions on the board that don’t apply e.g. Properties, Stations or FreeParking etc., we just return None. This is a much better alternative to null, as we’ll see below when we consume this function – now when we want to do something like the original C# “Roll” method, we can do this: -

We’re using Match again, this time on the result of the just-declared calculateMove method; if we got Some<Position> back, we print it out and then return it. Otherwise we got None, and just return the original position that we landed on. Easy. As is common with F#, there are no type annotations required in the code above – everything is inferred based on usage. Some methods are implicitly generic; others are specific for types depending on their usage; the compiler works this all out for us. The dice argument is a tuple of int and int; we’ll use that in the moveBy method, but again this is inferred.

Conclusion

Hopefully these two blog posts have illustrated some simple differences between functional and OO design that don’t involve the usual “recursion versus for loops” etc. F# Discriminated Unions are an excellent way to quickly declare a number of types, and in tandem with the match keyword you can define behaviours against the union quickly and easily. Even from this simplified example, there are some fundamental differences between how we model this problem in F# compared to C#: -

  • Data and Behaviour are separate. Behaviour can easily be added outside of the type hierarchy, but new Types in the hierarchy affect all behaviours.
  • Functions avoid state where possible; easier to test and easier to infer effect of any given function.
  • Pattern Matching over a discriminated union, together with Option types, allows us to easily create new behaviours without interfaces etc.
  • Option types allow us to perform “null-style” checks in a much stronger fashion using Pattern Matching.

Modelling problem domains in C# and F# – Part 1


I’m trying more and more to use F# for hobby projects etc. and finding, as usual, that there are very elegant, lightweight solutions to abstractions. This time: Monopoly.

Defining a Problem Space

A while ago I did something fairly simple in C# to simulate the Monopoly board that rolls two dice, moves a player on and records where they land. It then does this x times and shows what board spots get landed on the most. Here’s a simplified definition of the problem space: -

  • A board is made up of n positions
  • A position can be one of: -
    • a named property
    • a chance card
    • community chest card
    • station
    • free parking
    • go to jail
    • tax
    • jail
  • Some positions have movement actions associated with them. For example, landing on the Go To Jail position should move you to Jail. Landing on the Chance deck can have a number of random movements, such as moving to Go, Jail etc. etc.

Now I’m not going to write out the whole code etc. here for illustration. What I will do is give some snippets of how I defined the problem using C# constructs and how this might map into F#.

Mixing behaviour with data with Classes

So what if we wanted to do something simple like “handle a roll of the dice”? This should move to the appropriate position on the board, print it out, and also react if the position is a “movement” position like “Go To Jail”. Well, let’s first think about modelling the board. A relationship like this would in OO terms normally be defined through a type hierarchy; we might create a type for each position e.g. Property, Chance, Community Chest etc.

abstract class Position
{
   public abstract String Name { get; }
}

class Property : Position
{
   private readonly String propertyName; // e.g. Old Kent Road

   public Property(String propertyName)
   {
       this.propertyName = propertyName;
   }

   // The override has to repeat the property type
   public override String Name { get { return this.propertyName; } }
}

class Station : Position
{
   private readonly String stationName; // e.g. Kings Cross

   public Station(String stationName)
   {
       this.stationName = stationName;
   }

   public override String Name { get { return this.stationName; } }
}

I’ve only implemented a few classes above as there’s just too much code to show the lot. To be honest, in reality I probably wouldn’t even bother with immutable types as above and would have just used public get/set properties; there’s just too much boilerplate.

We would also need something like an IMovementPosition for positions such as “Go to Jail” or “Chance”, which move you on to another position on the board after you land on them.

class GoToJail : Position, IMovementPosition
{
    public override String Name { get { return "Go To Jail"; } } // Always the same
    public Boolean CalculateMove(IEnumerator<Position> currentPosition)
    {
         MoveTo<Jail>(); // helper method that enumerates Position until we hit a Position that is of type Jail.
         return true;
    }
}

Notice how our Position types now have behaviour associated with them; getters that sometimes do logic (i.e. delegate to a private field), sometimes not. Some types have extra behaviour. Some have none. This is all classic OO that we learned years ago from Booch and his mates – behaviour and state etc. So, given all the code above, we can now do as follows to roll the dice and move.

class BoardController
{
   private IEnumerator<Position> position; // maintain current position

   public void Roll(int firstDie, int secondDie)
   {
      //hidden side effect on position
      MoveBy(firstDie, secondDie);
      Console.WriteLine("Landed on {0}.", position.Current.Name);

      // see whether the current position implements IMovementPosition
      var movementPosition = position.Current as IMovementPosition;
      if (movementPosition != null)
      {
         //hidden side effect on position again, this time from another class
         if (movementPosition.CalculateMove(position))
            Console.WriteLine("Moved to {0}", position.Current.Name);
      }
   }
}

Notice how we check whether the newly-landed-on position is an “IMovementPosition”, and if so we then execute code to calculate the “extra” movement e.g. Go To Jail.

Conclusion

This is obviously a simplified example, and the code is more instructional to illustrate modelling some basic domain in C#. Things to note: -

  • Data and Behaviour travel together in a type.
  • State is mutated implicitly throughout.
  • Extra types e.g. IMovementPosition required to indicate optional extra capabilities.

In Part 2 I’ll model the same thing in F#.

Using wrappers to aid unit testing

30 April, 2013

As I alluded to recently when blogging about JustMock, one of the most important attributes of unit tests has to be that they are readable; you can easily reason about them and see what they do.

I also talked about Moq’s overly cumbersome and verbose approach to performing Setups on mocks – I rarely supply arguments for setup methods on mocks, since this would be doing two tests in one i.e. checking that we handle the result of the method, but also implicitly testing that we called the method with the correct arguments. The latter should be left for another test.

Coincidentally, I had a look at a few other frameworks recently: –

  • FsUnit, which is an F# unit testing library that wraps around NUnit / MSTest / xUnit etc. to provide a more succinct unit test experience in F#
  • Simple.Data, which is an awesome data access layer that works over multiple data sources and uses C#’s dynamic feature to allow you to very easily generate queries etc. against data sources with the minimum of fuss.

Simplifying Moq’s Setup

This got me thinking – could we not do the same with mocking frameworks? Well, a couple of hours later, the answer is yes. Here’s a simple example of how you can set up mocks in Moq much more succinctly using a dynamic wrapper class. First the original Moq Setup method: -

[Fact]
public void Foo_GotPayroll_LogsIt()
{
    SetupClassUnderTest();
    service.Setup(svc => svc.GetPayroll(It.IsAny<Person>(), It.IsAny<Person>(), It.IsAny<Person>())).Returns("ABCDEFG");

    // Act
    classUnderTest.Foo(null, null, null);

    // Assert
    logger.Verify(l => l.Log("Got back ABCDEFG."));
}

Notice the large amount of noise from the It.IsAny<T>() calls – almost 50% of the contents of the statement are taken up by It.IsAny().

Now look at this version: -

[Fact]
public void TestFoo()
{
    SetupClassUnderTest();
    service.Setup().GetPayroll().Returns("ABCDEFG");

    // Act
    classUnderTest.Foo(null, null, null);

    // Assert
    logger.Verify(l => l.Log("Got back ABCDEFG."));
}

It uses a new extension method overload of Setup, which operates slightly differently: -

  1. It returns a dynamic object which, when called, immediately seeks out any methods on the mock service that are called “GetPayroll”.
  2. It then filters out any overloads that do not have the matching return type of System.String.
  3. Then, for each matched method, it parses the argument list and generates an expression which calls the method, with an appropriate It.IsAny<T>() call for every argument.
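Steps 2 and 3 can be sketched with reflection and expression trees. This is not the wrapper's actual source, just a minimal illustration under stated assumptions: Placeholder.Any<T>() is a hypothetical stand-in for Moq's It.IsAny<T>() so that the sample compiles without Moq, and SetupExpressionBuilder, IMyService and its second GetPayroll overload are names invented for the example.

```csharp
using System;
using System.Linq;
using System.Linq.Expressions;
using System.Reflection;

// Hypothetical stand-in for Moq's It.IsAny<T>(); only its presence in the tree matters.
public static class Placeholder
{
    public static T Any<T>() { return default(T); }
}

public class Person { }

// Hypothetical service with two overloads that differ in return type.
public interface IMyService
{
    string GetPayroll(Person first, Person second, Person third);
    int GetPayroll(Person only);
}

public static class SetupExpressionBuilder
{
    // For every overload of methodName that returns TResult, builds
    // svc => svc.Method(Placeholder.Any<T1>(), Placeholder.Any<T2>(), ...)
    public static Expression<Func<TService, TResult>>[] Build<TService, TResult>(string methodName)
    {
        MethodInfo any = typeof(Placeholder).GetMethod("Any");

        return typeof(TService).GetMethods()
            // Step 2: keep only overloads with the matching name and return type.
            .Where(m => m.Name == methodName && m.ReturnType == typeof(TResult))
            // Step 3: one "any value" call per parameter, wrapped in a lambda.
            .Select(m =>
            {
                ParameterExpression svc = Expression.Parameter(typeof(TService), "svc");
                var args = m.GetParameters()
                    .Select(p => (Expression)Expression.Call(any.MakeGenericMethod(p.ParameterType)));
                MethodCallExpression call = Expression.Call(svc, m, args);
                return Expression.Lambda<Func<TService, TResult>>(call, svc);
            })
            .ToArray();
    }
}

public static class Program
{
    public static void Main()
    {
        // Only the string-returning overload survives the return-type filter.
        foreach (var setup in SetupExpressionBuilder.Build<IMyService, string>("GetPayroll"))
            Console.WriteLine(setup);
    }
}
```

In a real wrapper, each generated lambda would then be handed to the mock's own Setup method. The sketch stops at building the expression.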

In effect, it expands into the code of the first version, but at runtime. Notice how much more succinct the code is – you don’t need to waste time with It.IsAny<T>(), or call IgnoreArguments(), or even bother with the lambda expression – you simply provide the name of the method you want to mock out as a parameterless method call – which is what your intent is anyway – and then call Returns on it.

You can also do the same with Throws, which will take in an Exception and set up a Moq call to .Throws(). Easy.

Conclusion

This was more an experiment to see how easy it would be to create a more succinct wrapper around Moq (I’ll put the source code up for anyone that wants it) but also to see whether it would actually work from a consumption point of view – does it feel “right” to call a dynamic method which does setup / mocking for you? Can you have confidence in it? I leave that to you to decide :-)

First experiences of Telerik’s JustMock


Problems with Moq

Having migrated from Rhino Mocks over to Moq, I have found myself lately getting more and more frustrated with the verbosity of Moq for simple assertions. I present as exhibit one the GetPayroll method, called below.

public void Foo(Person first, Person second, Person third)
{
   logger.Log("Processing data for the following users: ");
   logger.Log(first);
   logger.Log(second);
   logger.Log(third);

   var payroll = myService.GetPayroll(first, second, third);

   logger.Log(String.Format("Got back {0}.", payroll));
}

I want to assert that I call the Log method with the result of GetPayroll. So I need to arrange that when I call GetPayroll, it returns an arbitrary string that I can use to assert in the call to Log(). Here’s the Moq test to prove that we log the correct payroll string: -

[Fact]
public void Foo_GotPayroll_LogsIt()
{
   var logger = new Mock<ILogger>();
   var myService = new Mock<IMyService>();
   var classUnderTest = new ClassUnderTest(logger.Object, myService.Object);
   myService.Setup(svc => svc.GetPayroll(It.IsAny<Person>(), It.IsAny<Person>(), It.IsAny<Person>())).Returns("ABCDEFG");

   // Act
   classUnderTest.Foo(new Person(), new Person(), new Person());

   // Assert
   logger.Verify(l => l.Log("Got back ABCDEFG."));
}

Notice that I don’t care what values are passed in to the service call. Why? Because I already have another unit test that Verifies that I called this method with the correct arguments. I don’t need to test that twice (which also increases fragility of tests).
What aggravates me is the ridiculous repeated use of It.IsAny<Person>(). Imagine you had more arguments in your stubbed method (this can be the case when mocking out some BCL interfaces or other third party ones)  – your tests can quickly become unreadable, lost in the sea of It.IsAny<T> calls.

What I want is something like Rhino Mocks’ IgnoreArguments() mechanism, or even better, TypeMock’s “ignore arguments by default” behaviour, which is a fantastic idea, encouraging you to only assert arguments during assertions and not during arrangement. Unfortunately, TypeMock is not available on NuGet and is a fairly heavyweight install, requiring add-ins to VS etc. I therefore gave JustMockLite (JML) a quick go – and so far I’ve been very impressed with it.

Just Mock Lite

Just Mock Lite is a free unit testing framework from Telerik. I saw some demos of it a few months ago, but frankly was not impressed with the API in the webcast – all the demos I saw showed Record / Replay syntax. There was nothing on AAA. However, I saw it on NuGet so thought “let’s see what it’s like anyway”. Just Mock also has a full version which includes TypeMock-like features e.g. mocking statics, concretes etc.

Getting up and running

Whenever I try out a framework like this, I try to avoid reading the docs to see how friendly the API is to the complete newbie – someone who knows what to expect from a unit test framework. I don’t want to spend hours in webpages going through APIs – I want the API to be discoverable and logical. I’m happy to say that the main JustMock static class, Mock, is very easy to use, such that I was able to get up and running without resorting to the online docs until I came across some more complex situations.

However, I would like to see a slightly cut-down version of the publicly-visible namespaces for JustMock Lite that doesn’t include the types that are only available with the “full” version. There are probably 15-20 classes and more namespaces underneath the Telerik.JustMock namespace – what are they all for? Do I as the client of the framework need to see all of them? Not sure. Perhaps some should be under an “.Advanced” namespace or something.

JML in action

Here’s a redone test of the one above using JustMockLite: -

[Fact]
public void Foo_GotPayroll_LogsIt()
{
    var logger = Mock.Create<ILogger>();
    var myService = Mock.Create<IMyService>();
    var classUnderTest = new ClassUnderTest(logger, myService);
    Mock.Arrange(() => myService.GetPayroll(null, null, null)).IgnoreArguments().Returns("ABCDEFG");

    // Act
    classUnderTest.Foo(new Person(), new Person(), new Person());

    // Assert
    Mock.Assert(() => logger.Log("Got back ABCDEFG."));
}

The main things to note are that: -

  • You don’t have the “Object” property anywhere; JustMock works like TypeMock, by having static methods that take in expressions that contain mock objects etc. This is nice as it cuts down on the fluff of Moq’s composition approach (which is still probably a cleaner approach than Rhino’s extension methods).
  • The JML Mock static methods have intelligent names – Arrange, Assert etc. etc. – exactly what you want if you follow the AAA unit testing approach.
  • IgnoreArguments() is back. Hurrah! Now I can just put in null or whatever for arguments and postfix them with .IgnoreArguments() – all done. This is much, much more readable, quicker to author, and less fragile than Moq’s approach. But TypeMock’s approach of ignore-by-default is a better approach still.
  • What if you need to specify “some” arguments? That’s easy – it reverts to the Moq approach, except there are handy constants for common “Ignore” type arguments. These are quick to type with intellisense and take up less space than the full It.IsAny<String>() malarkey: -

Mock.Assert(() => myService.DoStuff(Arg.AnyString, Arg.IsInRange(1, 5, RangeKind.Inclusive), Arg.IsAny<Person>()));

There are also the usual Match<T> as well as helpers on top of this like IsInRange etc. etc.

I was able to migrate a load of Moq tests to JustMock in about 30 minutes with the help of a couple of macros to rewrite Verify calls to Assert etc. etc. – pretty easy in fact. The API takes several pieces from Moq in terms of design although methods are of course renamed – instead of Times.x we now have Occurs.x etc. etc. – nothing to worry about.

Other features

I also noticed that JML supports call counting, which I blogged about a few weeks ago. This lets you easily say “I expect that this method was called x number of times”. Furthermore, you can chain sequences of results through an extension method in JustMock.Helpers that gives you a fluent-style chaining mechanism so you can say “return 1, then return 5, then return 10” – although I wonder how often this sort of feature would be required.

Criticisms

  • One area where JML falls short is its ability to generate recursive mocks. Whilst JML does support limited recursion, it cannot automatically return child mocks from methods on a parent mock; nor does it have the ability to make assertions on them. Instead, you need to manually create child mocks and wire them up as the return object for the parent mock’s method. This is unfortunate, because it does add a bit of complexity to some mocking scenarios, but thankfully it’s not a common situation.
  • The API could probably be cut down a bit – there are lots of classes in the main namespace that you will probably not often use.
  • The API is very powerful – probably one of the most powerful of the free mocking frameworks out there. This is of course a good thing, but it has its pitfalls. For example, in addition to doing the standard “AAA” style mocking, it also supports the old “Record/Replay” style of unit testing whereby you can set up expectations on methods during the arrange and then simply call “Assert” at the end. I hate this way of unit testing and would have preferred not to have seen those methods at all, or at least have them as an “opt-in”. People generally write unit tests in the RR or AAA style, but don’t tend to mix and match between them – neither type of developer will want to see the other style of unit test methods.
  • No XML comments on the API. Come on guys – it just takes a few minutes to put XML comments on your API with GhostDoc and then I don’t have to resort to opening up the browser to see what the Occurs method does on IAssertable.

Conclusion

Overall, I’m pretty happy with JML. I’ve only used it for a couple of days, so no doubt I’ve missed some things out – but so far I’m very impressed with it. It’s powerful – notwithstanding my reservations on recursive mocks, has a fairly lightweight “core” API that is easy to get up and running with, and is being actively worked on. There’s also the full version of the API which can mock all sorts of other things, so you can upgrade if required. If you’re starting a new project, I’d seriously recommend having a look at it before going down the route of Moq as you might well prefer this.

TypeScript exposes some irrational Microsoft hatred

2 October, 2012

Just watched a couple of videos on TypeScript – basically a Microsoft-developed (but open source) superset of JavaScript which compiles into plain old JS that gives you a type-safe environment to write JS. Unlike stuff like Dart or Script#, because it’s a superset of JS, you can easily opt-in to bits you want; it integrates without any issues with existing JS like Node.js or JQuery.

Initial thoughts of TypeScript

Firstly, coming from a .NET developer background, my first thought was that TS reminds me of F#! By that I mean that it builds on top of JS to give more functionality, in a similar way to how F# builds on top of the functionality of C#. Ironically some of this is accomplished by “removing” functionality e.g. by imposing static typing, just like F# introduces immutability, which initially seems like a restriction but actually is a boon.

The type inference also reminds me of F#, in that the return type of a method is inferred from what the method returns, and in how you can define fields directly within the constructor. In fact, it makes me wish that C# had better type inference – and it just goes to show you can do more powerful type inference without more CLR support.

All in all, it looks very nice – basically gives you more intellisense, refactoring opportunities e.g. rename, lambda expressions, implicit closures etc. which are very C#-like.

Initial thoughts of TypeScript – with feeling

This is all good. Yet I’ve already read an incredible amount of emotional, uninformed spouting about it. I’m not referring to the actual articles in the links here – although the Register one isn’t quite as accurate as I’d have liked, talking about “extending JavaScript” in the headline which isn’t quite right. They’ve written a new language which itself is a superset of JS. They’re not writing an MS-only version of JS.

Yet there are many comments on the following tangent: -

  • It’s from Microsoft. Ergo, it’s crap.
  • It’s from Microsoft. Ergo, it must have some hidden nefarious purpose.
  • It’s from Microsoft. Ergo, it won’t be secure.
  • It’s just another Dart. What’s the point. Oh it must be competing with Google for the sake of it.
  • They’re trying to lock me in to the Microsoft tool-space etc. etc.
  • Static typing is crap.
  • This will just fragment the JS community by creating a different version of JS.
  • I can write JS directly, what do I need this for?

What a load of tosh! Anything that allows me to refactor a method or property name easily, or gives me intellisense for exploring an API is definitely a good thing. It doesn’t affect the runtime – it’s still spitting out plain JS. It doesn’t require me to learn lots of new things if I’m a JS developer. The language builds on top of JS – so you can opt-in for the bits that you want. It’s open source. What’s the problem?

Of course – it’s written by Microsoft.

Another plug for F#…

26 September, 2012

The majority of .NET people I work with are C# developers. There are some VB guys I know and even a few Java people.

I also know a few F# folks from my time when I went on a SkillsMatter course last year and although I don’t use the language on a day-to-day basis, I still dip into it now and then, particularly if I find myself trying to write a component which acts in a functional sense e.g. takes in some data and returns a result.

I’ve been playing a little with F# 3.0 lately and have been really, really impressed with the data access and query elements that they’ve added to the language. The two main things that I’ve noticed are Query Expressions and Type Providers.

Query Expressions

Query Expressions bring LINQ-type query syntax directly into F#. To be fair, F# already had the extension method style syntax for pipelining data (which I happen to prefer) with the |> operator, but you can now write LINQ queries directly e.g. given a sequence of customers you could previously write a query as follows:

let results = customers
                  |> Seq.filter (fun customer -> customer.Age < 21)
                  |> Seq.map (fun customer -> customer.Surname)
                  |> Seq.distinct

But in F#3.0 you can write it like so: -

let results  = query { for customer in customers do
                       where (customer.Age < 21)
                       select customer.Surname
                       distinct }

Personally I don’t mind the pipeline style – in some ways I’m more used to that.

The real difference between F# and C#’s query operators is the sheer volume of them that are directly supported in F# compared to C#. In addition to the normal where, select and distinct etc., there are also ones such as all, average, first, skip and take etc. etc. all directly in the language itself – so you essentially get a much richer set of query operators directly supported in the F# language as opposed to just in the framework as a set of method calls.

Type Providers

Type Providers are a great way of consuming data from distant sources without having to manually create (or code-gen) types to handle the data coming in. Think about how you handle data from a web service, database or CSV file – typically at best you’ll use a code gen to create your proxy types, or at worst you create some POCOs by hand and manually do the mapping.

Type Providers give us a way to, at code-time (not compile time!), point at e.g. a web service URI and automatically get a strongly-typed dataset back. In the example below, I’m connecting to the Netflix OData endpoint: -

type odata = ODataService<"http://odata.netflix.com/v2/Catalog/">

I can now get my context from this type and go right ahead and start interrogating the model. There’s no need for an “add service reference” etc. which code-gens – this is all happening at edit-time, immediately after I finished typing the line above.

image

And Type Providers are extensible so you can create your own ones as you see fit; there are now type providers for Entity Framework models, L2S, CSV etc. etc – nice.

Conclusion

F#3.0 brings some very nice data access points directly into the language. Again, to me F# is growing very quickly, and very nicely, to the point where it starts to make C# look a little like it’s standing still. Of course, the two languages target very different domains – C# is a general purpose OO language (which in recent years has brought in some functional constructs), and it’s always going to target a larger audience than F#. As a result, it has different aims and goals as a language than F#.

Nonetheless, there are times where I wish C# had the type inference from F#, or its pattern matching capabilities, or its support for helping you write functionally pure code. If you’re at all interested in broadening your views on programming, I would strongly recommend you spend a bit of time looking at F#, if for no other reason than you might improve the way you write your C#. It’ll also give you a better understanding of what a compiler does – you’ll start to realise just what the C# compiler does (or doesn’t do :-)) when you hit build.


Using Aggregate in LINQ


The System.Linq namespace has a load of useful extension methods like Where, Select etc. etc. that allow us to chain up bits of code that operate over sequences of data, allowing us to apply functional-style programming to our data.

One method that is often overlooked, yet probably lends itself best to functional programming, is the Aggregate() method. Unlike a method such as Select, which, given a set of n items, projects a set of n other items, Aggregate can be used as a way of merging a collection of items into a different number of items. Indeed, some LINQ methods can be implemented easily with Aggregate, such as Sum: -

image

image
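As a minimal sketch of the idea (the class and method names are assumptions made up for this example, not the code from the original screenshots), Sum can be written on top of the single-argument overload of Aggregate like this:

```csharp
using System;
using System.Linq;

public static class AggregateSum
{
    // The lambda receives the running total (accumulator) and the next
    // item (value), and returns the new running total.
    public static int Sum(int[] numbers)
    {
        return numbers.Aggregate((accumulator, value) => accumulator + value);
    }
}

public static class Program
{
    public static void Main()
    {
        int[] numbers = Enumerable.Range(1, 10).ToArray();
        Console.WriteLine(AggregateSum.Sum(numbers)); // prints 55
    }
}
```

One difference from the built-in Sum is worth knowing: this overload of Aggregate throws an InvalidOperationException on an empty sequence, whereas Sum returns 0.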

The syntax looks a bit bizarre, especially when you look at the function signature of the method (including the overloads), but essentially the method takes in a function which itself takes in two values and returns another: -

  • accumulator, which is an arbitrary object which is passed through every item in the collection. In the default overload of Aggregate, this is the same type as the source collection e.g. Int32.
  • value, which is the next value in the chain e.g. 1, then 2, then 3 etc.
  • result, which will become the accumulator in the next iteration

So, if we were to expand the above bit of code with debugging statements etc., it would look something like this: -

image

Note that with the default function overload, the initial value of the accumulator is the first value in the collection (1), aka the “seed” value.
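The seeding behaviour is easy to see by recording each call to the lambda. The following is a small sketch with invented names (AggregateTrace, Steps), not the original debugging code:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class AggregateTrace
{
    // Records one line per invocation of the aggregation function.
    public static List<string> Steps(IEnumerable<int> numbers)
    {
        var steps = new List<string>();
        numbers.Aggregate((accumulator, value) =>
        {
            int result = accumulator + value;
            steps.Add(string.Format("accumulator={0}, value={1}, result={2}",
                                    accumulator, value, result));
            return result; // becomes the accumulator of the next call
        });
        return steps;
    }
}

public static class Program
{
    public static void Main()
    {
        foreach (string step in AggregateTrace.Steps(new[] { 1, 2, 3, 4 }))
            Console.WriteLine(step);
        // accumulator=1, value=2, result=3
        // accumulator=3, value=3, result=6
        // accumulator=6, value=4, result=10
    }
}
```

For four items the function runs only three times, because the first item is consumed as the seed rather than being passed in as a value.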

More complex uses of Aggregate

Let’s say we wanted to print a single string out which is all of the numbers separated by a space e.g. “1 2 3 4 5 6 7 8 9 10”. Common LINQ methods wouldn’t be appropriate for this. You could use Select to get the string representation, but would get a sequence of 10 strings rather than a single one. You might now fall back to foreach loops etc., but this is where Aggregate is useful: -

image

This overload of Aggregate takes in two arguments – the first is a “seed value” which will be the initial value of the accumulator, in our case an empty String. Every iteration takes the accumulator, appends the next number to it and returns the resultant String as the next accumulator, which gives us the following (debug statements added): -

image
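A minimal sketch of the seeded overload follows, again with invented names. It differs slightly from the description above in that it only adds the separator once the accumulator is non-empty, so the result has no stray leading or trailing space:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class AggregateJoin
{
    // The seed (an empty string) fixes the accumulator type to string,
    // even though the source items are ints.
    public static string Join(IEnumerable<int> numbers)
    {
        return numbers.Aggregate(
            string.Empty,
            (accumulator, value) => accumulator.Length == 0
                ? value.ToString()
                : accumulator + " " + value);
    }
}

public static class Program
{
    public static void Main()
    {
        Console.WriteLine(AggregateJoin.Join(Enumerable.Range(1, 10)));
        // prints: 1 2 3 4 5 6 7 8 9 10
    }
}
```

Because a seed is supplied, this overload also copes with an empty sequence: it simply returns the seed.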

Simples! (obviously in a real world example you might use a StringBuilder as your Accumulator instead).

Notice how in this example we didn’t return the same type as the collection that we operated over (i.e. Int32). We can use this technique to do all sorts of funky things to collections that you might not have considered before.

Conclusion

Aggregate is a rarely used but extremely powerful LINQ method. In my next post, I’ll build on this showing some more powerful (and perhaps useful!) examples of Aggregate.


LINQ in C#2

29 January, 2012

Introduction

Continuing my series of posts on LINQ, I wanted to give a simple example as to how one can get the same sort of functionality in terms of query composition and lazy evaluation by using the yield keyword and without using any of C#3’s features. Bear in mind that LINQ was introduced as part of .NET 3.5, which itself runs on the same CLR as .NET 2. So everything that happens with LINQ is “just” a set of compiler tricks and syntactic sugar etc. – at runtime there’s nothing that happens that can’t be done manually with C#2.

Here’s the task we’ll tackle: Get the next 5 dates that fall on a weekend.

Streaming data

In purely non-LINQ terms, we could easily carry out this operation as a while loop, bespoke for the problem at hand. However, this wouldn’t offer any of the benefits that an API like LINQ offers e.g. composability and reusability of operations, which is what we’re trying to achieve – so let’s assume we’re trying to use a query-style mechanism; also, we want to try to create something more like a Date query API that we could use to write other, similar queries in future.

In C#3 using LINQ we might use Where() to filter out non-weekend days and then Take() to retrieve five items. But there’s an initial challenge that we encounter when trying to do this query with LINQ – what set of data do we actually query over? There’s no static “All Dates” property in .NET, and we don’t know in advance the set of dates to query over. This is where yield comes in. It allows us to easily create sequences of data that can be generated at run-time and queried over.

Take a look at this: -

image

Ignore the DisposableColour class – it just temporarily changes the colour of the console foreground. What’s more important is that this method returns something that masquerades as a collection of DateTimes – when in reality we’re generating an infinite stream of DateTimes, starting at the date argument provided. This collection has no end and you can never ToList() it to fully execute it. Well, you could try, but it would keep going until DateTime.MaxValue is reached. It simply generates dates on demand starting from the provided date.
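
A minimal sketch of such a stream, using only C#2 features, might look like this. The method name CreateDateStream is the one used later in this post; the body, the parameter name and the containing class are assumptions on my part, and I've left out the console colouring and debug messages: -

```csharp
using System;
using System.Collections.Generic;

public static class DateStreamSketch
{
    // Yields the start date, then each following day, one at a time.
    // Nothing is calculated until a caller asks for the next item.
    public static IEnumerable<DateTime> CreateDateStream(DateTime start)
    {
        DateTime current = start;
        while (true)
        {
            yield return current;          // hand one date to the caller, then pause here
            current = current.AddDays(1);  // only runs when the caller asks for more
        }
    }
}
```

The while (true) loop looks alarming, but because of yield the method body only advances one step each time the consumer calls MoveNext(), so a caller that stops asking simply leaves the loop suspended.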

Implementing composable query methods

Given this stream, we can write two other methods: one which filters out dates that do not fall on a Saturday or Sunday, and another which will only “take” a number of items from the sequence and then stop: -

image

Notice with the above method, we only yield out dates that match the provided days required, otherwise it doesn’t give back anything and implicitly iterates to the next item. Next, here’s a generic implementation of Take. It returns the next item in the collection, and then when it has returned the required number of items, breaks the foreach loop.

image
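
For reference, here is one way the two methods described above could be written in C#2. The names FilterByDay and DateQuerySketch are hypothetical, as is the choice of a params array for the required days: -

```csharp
using System;
using System.Collections.Generic;

public static class DateQuerySketch
{
    // Passes through only those dates whose day of the week is in daysRequired.
    public static IEnumerable<DateTime> FilterByDay(
        IEnumerable<DateTime> dates, params DayOfWeek[] daysRequired)
    {
        foreach (DateTime date in dates)
        {
            if (Array.IndexOf(daysRequired, date.DayOfWeek) >= 0)
                yield return date;
            // No match: nothing is yielded and the loop moves on to the next date.
        }
    }

    // Yields at most count items and then stops pulling from the source.
    public static IEnumerable<T> Take<T>(IEnumerable<T> source, int count)
    {
        if (count <= 0)
            yield break;

        int taken = 0;
        foreach (T item in source)
        {
            yield return item;
            if (++taken == count)
                yield break;   // ends the foreach, so the source is never asked again
        }
    }
}
```

Neither method knows anything about where its input comes from; each just consumes one IEnumerable and exposes another, which is what makes them composable.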

Consuming composable methods

Imagine that the methods above lived in a self-contained API that allowed us to easily query DateTimes – here’s how we could use it to answer our original question: -

image

All we do is generate the stream of DateTimes and then pipe them through the two other methods. The beauty of this is that because we’re yielding everything from the first method to the last, we only generate DateTimes that are required.

The key is the CreateDateStream method i.e. the stream of Dates. We cannot generate “every” date in advance – that would be grossly inefficient; it’s much better to dynamically create a stream as required.

  • dateStream is a stream of all dates starting from DateTime.Today.
  • weekendDays is the filtered stream of dates from dateStream that fall on Saturday or Sunday.
  • nextUpcomingWeekendDays is the stream of the first 5 items from weekendDays.

If we run the code above, we get the following output: -

image

Look at the messages in more detail. We only created enough dates until we matched 5 weekend days. Only those dates that fall into the required filter criteria get streamed into the Take() method, and only those fall out into Main. When we’ve taken enough, Take() breaks the loop which ends the foreach.
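
To make that flow concrete, here is a self-contained sketch of the whole pipeline that counts how many dates the stream actually produces rather than printing messages. The variable names dateStream, weekendDays and nextUpcomingWeekendDays come from the list above; the Generated counter, the WeekendsOnly helper and the class name are assumptions added purely for illustration: -

```csharp
using System;
using System.Collections.Generic;

public static class LazyPipelineSketch
{
    // How many dates the stream has produced so far (illustration only).
    public static int Generated;

    static IEnumerable<DateTime> CreateDateStream(DateTime start)
    {
        for (DateTime current = start; ; current = current.AddDays(1))
        {
            Generated++;
            yield return current;
        }
    }

    static IEnumerable<DateTime> WeekendsOnly(IEnumerable<DateTime> dates)
    {
        foreach (DateTime date in dates)
            if (date.DayOfWeek == DayOfWeek.Saturday || date.DayOfWeek == DayOfWeek.Sunday)
                yield return date;
    }

    static IEnumerable<T> Take<T>(IEnumerable<T> source, int count)
    {
        if (count <= 0) yield break;
        int taken = 0;
        foreach (T item in source)
        {
            yield return item;
            if (++taken == count) yield break;
        }
    }

    // Returns the next five weekend dates on or after start.
    public static List<DateTime> NextFiveWeekendDays(DateTime start)
    {
        Generated = 0;
        IEnumerable<DateTime> dateStream = CreateDateStream(start);
        IEnumerable<DateTime> weekendDays = WeekendsOnly(dateStream);
        IEnumerable<DateTime> nextUpcomingWeekendDays = Take(weekendDays, 5);
        return new List<DateTime>(nextUpcomingWeekendDays);   // enumeration happens here
    }
}
```

Starting from Monday 30 January 2012, the fifth weekend day is Saturday 18 February, so Generated ends up at 20: the stream produced exactly the dates from 30 January to 18 February and not one more.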

Conclusion

Yield is one of the key enablers for writing lazily-evaluated queries and collections. Without it, your queries would be less composable as well as less efficient; streaming out data as required allows us to only generate that part of the data that we still require.

In our example, we could get the next ten upcoming days without filtering, because Take also operates on IEnumerable<DateTime> – we can simply chain up our methods as and where needed.

Lastly – we could easily change the signatures of the three API methods to make them Extension Methods to give us a more LINQ-style DSL. It looks more like LINQ, but it’s still exactly the same code: -

image
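
As a sketch of what the extension-method flavour might look like (this does need C#3 for the this modifier), the following uses hypothetical names DaysFrom, OnDays and TakeFirst; I've deliberately avoided calling the last one Take so that it can't be confused with LINQ's own Enumerable.Take: -

```csharp
using System;
using System.Collections.Generic;

public static class DateQueryExtensions
{
    // An endless stream of dates beginning at start.
    public static IEnumerable<DateTime> DaysFrom(this DateTime start)
    {
        for (DateTime current = start; ; current = current.AddDays(1))
            yield return current;
    }

    // Keeps only the dates that fall on one of the given days.
    public static IEnumerable<DateTime> OnDays(
        this IEnumerable<DateTime> dates, params DayOfWeek[] days)
    {
        foreach (DateTime date in dates)
            if (Array.IndexOf(days, date.DayOfWeek) >= 0)
                yield return date;
    }

    // Yields at most count items, then stops.
    public static IEnumerable<T> TakeFirst<T>(this IEnumerable<T> source, int count)
    {
        if (count <= 0) yield break;
        int taken = 0;
        foreach (T item in source)
        {
            yield return item;
            if (++taken == count) yield break;
        }
    }
}

public static class ExtensionUsageSketch
{
    // The same query as before, written as a left-to-right chain.
    public static List<DateTime> NextWeekendDays(DateTime start, int count)
    {
        return new List<DateTime>(
            start.DaysFrom()
                 .OnDays(DayOfWeek.Saturday, DayOfWeek.Sunday)
                 .TakeFirst(count));
    }
}
```

The method bodies are untouched; only the first parameter of each gains the this modifier, which lets the compiler rewrite the chained calls into the nested static calls we had before.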

If you’re struggling with this, it might help you to write out the code yourself and step through it with the debugger to see the actual flow of messages, or try creating some simple yield-style collections yourself.


Psychic LINQ

28 January, 2012

A relatively short post on cargo cult programming, particularly related to LINQ.

LINQ is a fantastic technology. The idea of making a platform-agnostic query language is a fantastic idea. You can write the same query, in C#, over an in-memory list or a database and from the client point of view treat it in the same way. Isn’t it wonderful!

I’ve recently carried out a number of job interviews where candidates had to answer the following question:

If you wanted to find all Customers whose name was “Isaac”, why would you use a .Where() clause over a collection of T rather than using a foreach loop and manually construct the result set?

The results were varied. What I was looking for was a discussion of the benefits of declarative versus imperative code (what versus how), composability of queries and so on.

Strangely enough, the most common answer I got was "LINQ is faster than a foreach loop". Why? Either because LINQ somehow "optimises" code to make it faster, or because it "doesn’t need to loop through a collection – it just ‘does’ a where clause over the collection so that it instantly finds it". Almost as if C# is doing magic! In both cases the candidates could not justify their beliefs with evidence – it was just their feeling that that “must” be the case.

Now, let’s talk about the reality. I would direct everyone to Jon Skeet’s fantastic EduLinq blog series to get an in-depth understanding of how LINQ to Objects works, but always remember this simplification:

The only methods and properties that LINQ has for all its query methods are sourced from IEnumerator<T>

  • Boolean MoveNext();
  • T Current { get; }

That’s it. Think about that. There is no magic involved. If you do a Where() clause over a collection and enumerate the results, you will enumerate the entire source. There is no “pixie dust” that will give LINQ the answer quicker than a foreach loop and an if statement – and bear in mind, foreach loops are just syntactic sugar over IEnumerator, just like the using statement wraps IDisposable.
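
To see why, here is a hand-rolled filter that uses nothing but GetEnumerator(), MoveNext() and Current. It is a simplified illustration rather than the real implementation of Enumerable.Where, and the CountVisited helper is a made-up method whose only job is to show how many items get looked at: -

```csharp
using System;
using System.Collections.Generic;

public static class ManualWhereSketch
{
    // A simplified stand-in for Where: it can only walk the source item by item.
    public static IEnumerable<T> Where<T>(IEnumerable<T> source, Predicate<T> predicate)
    {
        using (IEnumerator<T> enumerator = source.GetEnumerator())
        {
            while (enumerator.MoveNext())        // every single item is visited
            {
                T item = enumerator.Current;
                if (predicate(item))
                    yield return item;
            }
        }
    }

    // Returns how many items the predicate was asked about while finding all matches.
    public static int CountVisited(IEnumerable<string> names, string wanted)
    {
        int visited = 0;
        foreach (string match in Where<string>(names, n => { visited++; return n == wanted; }))
        {
            // Nothing to do with each match; we just need to drain the sequence.
        }
        return visited;
    }
}
```

Searching a four-item list for “Isaac” reports four visits however many matches there are, because the only way to know whether an item matches is to look at it.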


Let there be LINQ

25 January, 2012

Just a quick post regarding use of the let keyword in LINQ, which I find to be somewhat under-used by many people. Whilst one benefit of it can be readability (i.e. aliasing sections of complex queries to aid understanding of the query), the other benefit can be performance.

There is indeed a cost associated with using it i.e. every time you use it, you’re effectively creating a new anonymous type to hold that value plus whatever the previous result in your query pipeline was. So if you chain up lots of lets in a query, that’ll have an impact on the query. However, there is a case where let can give large performance benefits: -

image

Compare that code with the following: -

image

This will eliminate a massive number of calls to Convert.ToInt32() and reduce the time taken to process that query by around 40%; the former sample took ~1400ms to run whereas the latter took only around 800ms.
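
The kind of query pair I have in mind looks something like the sketch below. The data, the range check and the method names are invented for illustration, so don’t expect them to reproduce the timings quoted above: -

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LetSketch
{
    // Without let: each string may be converted up to three times
    // (twice in the where clause, once more in the select).
    public static List<int> WithoutLet(IEnumerable<string> values)
    {
        return (from text in values
                where Convert.ToInt32(text) > 10 && Convert.ToInt32(text) < 100
                select Convert.ToInt32(text)).ToList();
    }

    // With let: each string is converted exactly once and the result is reused.
    public static List<int> WithLet(IEnumerable<string> values)
    {
        return (from text in values
                let number = Convert.ToInt32(text)
                where number > 10 && number < 100
                select number).ToList();
    }
}
```

Both methods return the same results; the second simply trades the small cost of carrying number through the pipeline for the saving of not repeating the conversion.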
