Lightweight callsite caching with F# and C#


What is Memoization?

Memoization means caching the result of a function call against its arguments, so that a repeat call with the same arguments returns the cached value instead of re-running the code. This is just a simple demo outlining how you might write a generic callsite cache, aka Memoize, in F#. You can find this elsewhere on the web – it’s easy to write one. What I really want to illustrate is how much more boilerplate you need if you write something similar in C#. I’ve actually written a more fully-featured cache Call Handler for Unity which works with any number of arguments, but trust me when I say that it was a fair amount of work. And when you’re working with mutable data structures it’s difficult to know what you can safely do with them, e.g. putting them in hash tables (hash values on mutable objects can potentially change over time, which is not the case with immutable structures).

Memoize in C#

Anyway… here’s an example in C# of a higher-order function that takes in a given function and will wrap the call in a cache, and first some sample calling code to show how we consume it.

Ouch! This isn’t the most readable code in the world (although I tried my best :-)) On the client, we need to explicitly supply the generic type arguments so that the Memoize function knows what type of cache to create. I had hoped that the compiler could infer this based on the signature of the method being supplied, but sadly not. Also, because the cache we’re using just works on a single argument, we have to supply all values as a single Tuple, so rather than just calling Add(5,10), we have to call Add(Tuple.Create(5,10)). This isn’t great. We could try to change the way the cache works to take in multiple arguments, but there would be limitations and it wouldn’t be truly generic.

The implementation of the cache isn’t that much better. We’re lost in a sea of TArgs and TResults for the method signature, and also have one of the dreaded out parameters for our TryGetValue from the cache. Otherwise it’s fairly bland – see if the value is in the cache; if it is, return it, otherwise call the supplied “real” code, and cache that result before passing it back out. Pretty basic chain of responsibility.
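The original listing isn't shown here, so as a rough sketch of the sort of thing being described (the names CallsiteCache, Calculator and Calls are my own for illustration, and this is not thread-safe), a single-argument Memoize might look something like this:

```csharp
using System;
using System.Collections.Generic;

public static class CallsiteCache
{
    // Wraps 'code' so that each distinct argument value is only evaluated once.
    public static Func<TArgs, TResult> Memoize<TArgs, TResult>(Func<TArgs, TResult> code)
    {
        var cache = new Dictionary<TArgs, TResult>();
        return args =>
        {
            TResult result;
            if (cache.TryGetValue(args, out result))
                return result;          // cache hit: skip the real call

            result = code(args);        // cache miss: run the real code...
            cache.Add(args, result);    // ...and remember the answer
            return result;
        };
    }
}

public static class Calculator
{
    public static int Calls; // counts real invocations, for demonstration only

    // All arguments are packed into one Tuple because the cache is single-argument.
    public static int Add(Tuple<int, int> args)
    {
        Calls++;
        return args.Item1 + args.Item2;
    }
}
```

The caller would then write CallsiteCache.Memoize<Tuple<int, int>, int>(Calculator.Add) and invoke the result with Tuple.Create(5, 10). System.Tuple compares by value, which is what makes it usable as a dictionary key here.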

Memoize in F#

Here’s nearly the same code but in good old F#!

So this code boils down to pretty much the same as the C# sample, except that all the fluff is gone. From the caller’s point of view, we declare our add function, and then simply call memoize and pass it in. No need for generic arguments, as the F# compiler is smart enough to infer them. We also create a Tuple in F# with syntax that appears to be more like C# method arguments i.e. (5,10). This is succinct, lightweight and easily readable.

For the implementation of the cache, it’s also much cleaner. Firstly, there are no generic type arguments. In fact, there are no explicit types at all except for the Dictionary. F# also handles “out” parameters much more elegantly than C#, so TryGetValue can be easily called and consumed as a single expression.

Conclusion

This was just a fairly short and simple code sample illustrating how type annotations can sometimes quickly get out of control. Having automatic generalisation of functions lets us concentrate on the core functionality of what we’re trying to achieve – the F# version is around 50% of the size of the C# one, but does the same thing. There’s also an example of how nicely F# interoperates with .NET’s out parameters by returning them as part of a Tuple.

Pigs in Lipstick


A “fluffy” post today, that doesn’t talk about F# or EF or RavenDB etc. but about general development processes. Something I think all developers need to learn, and are always honing, is the ability to trust their instinct when something feels wrong.

What do I mean by this? Well, primarily this is about identifying when you’re working on a piece of code that e.g.: –

  • takes longer than it should to write (or test!)
  • is overly complicated
  • involves copious amounts of boilerplate

These sorts of smells are things that we need to learn to identify as early as possible – ideally when writing it the first time, or at least before the whole team is exposed to a particular paradigm or pattern. By then it’s a costly exercise to fix because of the amount of “wrong” code that’s been written, and the re-training exercise that will occur throughout the team.

Identifying smells

I think that when you make mistakes within software development, that’s when you learn the most – but you can only learn from those mistakes if you identify them (or someone identifies them for you). When I try to model a problem in code, often in my head I’ll go through a simple decision tree where I just knock off the solutions that I don’t think will “work” well and the ones that are left usually will do the job e.g.

  • can I use a factory to make this easier?
  • should I split this class into two… it seems like it’s violating SRP?
  • can I use the Strategy Pattern to simplify the boilerplate and duplicate code in these classes?

More important than that, though, is having confidence in your instincts – if what you’re working on “feels” wrong, it probably is. Or to put it another way – if it quacks like crap code, walks like crap code and swims like crap code – it’s probably crap code.

Summary

Take 10 minutes in your day to review what you’re doing (or get a colleague to do it) before you get bogged down into the depths of the problem and lose perspective. I can’t stress enough how important it is to have early code reviews / pair programming sessions when developing – this is when you can shift between one approach and another as cheaply as possible. Otherwise, you’ll spend three or four days on something only for the code reviewer to say “mmmm… there’s a much better way of doing this in half the time”. You’ll most likely feel defensive that someone has essentially said you’ve wasted a few days on something, and they’ll feel awkward about saying it. It doesn’t matter whether you’re using a fantastic technology; dressing up crap code in an awesome language or framework will not save you – so spot these pigs in lipstick early, and save your team some money.

Modelling problem domains in C# and F# – Part 2


In my last post, I illustrated how we could model a simple real-world problem using classic OO concepts such as type hierarchies, interfaces and stateful objects. In this post I want to contrast that with a functional-first approach using F#.

Discriminated Unions

In order to model the different types of Positions on a Monopoly board in C#, we used a type hierarchy with an abstract base class and derived specialisations. However, whilst F# has those same constructs, it also has the lightweight Discriminated Union, which for our problem is a much better fit: –

This gives us, in C# terms, a full class hierarchy with Position as the “abstract base class”, and the others “inheriting” from it. Each of the above lines equates to a full class in C# terms. Read that again. Each of those lines maps to a full C# class, with constructors and properties etc. The first two have a constructor that takes in a string which then gets copied into a read-only field and exposed as a get-only property. The types are immutable and have full value-based equality checking. Compile the above in F#, and then do a “go to definition” from a C# project to see just how much boilerplate this is doing for you.
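To give a feel for what a single one of those cases costs when written by hand, here is a rough C# sketch of an immutable, value-equal Property type. This is only my approximation for illustration; the code the F# compiler really emits is more involved than this.

```csharp
using System;

public abstract class Position { }

public sealed class Property : Position, IEquatable<Property>
{
    private readonly string name; // copied in once, never changed

    public Property(string name)
    {
        this.name = name;
    }

    public string Name { get { return name; } }

    // Value-based equality: two Properties with the same name are equal.
    public bool Equals(Property other)
    {
        return !ReferenceEquals(other, null) && name == other.name;
    }

    public override bool Equals(object obj)
    {
        return Equals(obj as Property);
    }

    public override int GetHashCode()
    {
        return name == null ? 0 : name.GetHashCode();
    }
}
```

That is roughly twenty-five lines for one case, and you would need to repeat it for every other position type.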

Notice the following as well: –

  • We just store data for the union types that we need (Property and Station) e.g. “Old Kent Road” or “Kings Cross”. The others don’t have any unique data in them so we don’t store any.
  • There’s also no “base” commonality between the types – we can shape them however we want in an unrestricted fashion.
  • There’s no behaviour on them – just some data.

Applying behaviour to type hierarchies in F#

So given that there are no methods on these types, how do we do stuff like getting the Name of the position that we’ve landed on in a “polymorphic”-esque manner? The answer is that we use the match keyword to write a single function for all of the union types.

This is a key difference from how we modelled this in the OO world. Previously, this behaviour existed across all types in the domain. Now we have a single function which represents the behaviour of printing the name for all types of Position. Note how for Property and Tax we “declare” an appropriate variable inline to get at the Property Name or Tax Amount. Match is also smart enough to give compiler warnings if you miss out a type of discriminated union, so as we add new types the compiler will alert us for missing cases.

Optional behaviours on type hierarchies

Now let’s implement the equivalent for the IMovementPosition that GoToJail and Chance etc. would implement. We can do this with a function that takes in a position and returns an optional Position: –

Here we’re using F#’s Option<T> type. The calculateMove function may or may not return a new position; it’ll only do so if we landed on e.g. Go To Jail, or a Chance card that involved a movement e.g. Advance to Go. So we can return “Some” Position or None; indeed notice also the “_” match; for positions on the board that don’t apply e.g. Properties, Stations or FreeParking etc., we just return None. This is a much better alternative than null, as we’ll see below when we consume this method – now when we want to do something like the original C# “Roll” method, we can do this: –

We’re using Match again, this time on the result of the just-declared calculateMove method; if we got Some<Position> back, we print it out and then return it. Otherwise we got None, and just return the original position that we landed on. Easy. As is common with F#, there are no type annotations required in the code above – everything is inferred based on usage. Some methods are implicitly generic; others are specific for types depending on their usage; the compiler works this all out for us. The dice argument is a tuple of int and int; we’ll use that in the moveBy method, but again this is inferred.

Conclusion

Hopefully these two blog posts have illustrated some simple differences between functional and OO design that don’t involve the usual “recursion versus for loops” comparison. F# Discriminated Unions are an excellent way to quickly declare a number of types, and in tandem with the match keyword you can define behaviours against the union quickly and easily. Even from this simplified example, there are some fundamental differences between how we model this problem in F# and in C#: –

  • Data and Behaviour are separate. Behaviour can easily be added outside of the type hierarchy, but new Types in the hierarchy affect all behaviours.
  • Functions avoid state where possible; this makes them easier to test, and makes it easier to reason about the effect of any given function.
  • Pattern Matching over a discriminated union, together with Option types, allows us to easily create new behaviours without interfaces etc.
  • Option types allow us to perform “null-style” checks in a much stronger fashion using Pattern Matching.

Modelling problem domains in C# and F# – Part 1


I’m trying more and more to use F# for hobby projects etc. and finding, as usual, that there are very elegant, lightweight solutions to abstractions. This time: Monopoly.

Defining a Problem Space

A while ago I did something fairly simple in C# to simulate the Monopoly board that rolls two dice, moves a player on and records where they land. It then does this x times and shows what board spots get landed on the most. Here’s a simplified definition of the problem space: –

  • A board is made up of n positions
  • A position can be one of: –
    • a named property
    • a chance card
    • community chest card
    • station
    • free parking
    • go to jail
    • tax
    • jail
  • Some positions have movement actions associated with them. For example, landing on the Go To Jail position should move you to Jail. Landing on the Chance deck can have a number of random movements, such as moving to Go, Jail etc.

Now I’m not going to write out the whole code etc. here for illustration. What I will do is give some snippets of how I defined the problem using C# constructs and how this might map into F#.

Mixing behaviour with data with Classes

So what if we wanted to do something simple like “handle a roll of the dice”? This should move to the appropriate position on the board, print it out, and also react if the position is a “movement” position like “Go To Jail”. Well, let’s first think about modelling the board. A relationship like this would in OO terms normally be defined through a type hierarchy; we might create a type for each position e.g. Property, Chance, Community Chest etc.

abstract class Position
{
   public abstract String Name { get; }
}

class Property : Position
{
   private readonly String propertyName; // e.g. Old Kent Road
   public Property(String propertyName)
   {
       this.propertyName = propertyName;
   }

   public override String Name { get { return this.propertyName; } }
}

class Station : Position
{
   private readonly String stationName; // e.g. Kings Cross
   public Station(String stationName)
   {
       this.stationName = stationName;
   }

   public override String Name { get { return this.stationName; } }
}

I’ve only implemented a few classes above as there’s just too much code to show the lot. To be honest, in reality I probably wouldn’t even bother with immutable types as above and would have just used public get/set properties; there’s just too much boilerplate.

We would also need something like an IMovementPosition for when you land on “Go to Jail” or “Chance” to move after landing to another position on the board.

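// Implemented by positions that move the player again after landing
interface IMovementPosition
{
    Boolean CalculateMove(IEnumerator<Position> currentPosition);
}

class GoToJail : Position, IMovementPosition
{
    public override String Name { get { return "Go To Jail"; } } // Always the same
    public Boolean CalculateMove(IEnumerator<Position> currentPosition)
    {
         MoveTo<Jail>(); // helper method (not shown) that enumerates Position until we hit a Position that is of type Jail.
         return true;
    }
}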

Notice how our Position types now have behaviour associated with them; getters that sometimes do logic (i.e. delegate to a private field), sometimes not. Some types have extra behaviour. Some have none. This is all classic OO that we learned years ago from Booch and his mates – behaviour and state etc. So, given all the code above, we can now do as follows to roll the dice and move.

class BoardController
{
   private IEnumerator<Position> position; // maintain current position

   public void Roll(int firstDie, int secondDie)
   {
      //hidden side effect on position (MoveBy is a helper method, not shown)
      MoveBy(firstDie, secondDie);
      Console.WriteLine("Landed on {0}.", position.Current.Name);

      // see whether the position implements IMovementPosition
      var movementPosition = position.Current as IMovementPosition;
      if (movementPosition != null)
      {
         //hidden side effect on position again, this time from another class
         if (movementPosition.CalculateMove(position))
            Console.WriteLine("Moved to {0}", position.Current.Name);
      }
   }
}

Notice how we check whether the newly-landed-on position is an “IMovementPosition”, and if so we then execute code to calculate the “extra” movement e.g. Go To Jail.

Conclusion

This is obviously a simplified example, and the code is there to illustrate how you might model a basic domain in C# rather than to be complete. Things to note: –

  • Data and Behaviour travel together in a type.
  • State is mutated implicitly throughout.
  • Extra types e.g. IMovementPosition required to indicate optional extra capabilities.

In Part 2 I’ll model the same thing in F#.

Why Entity Framework renders the Repository pattern obsolete?


A post here on a pattern I thought was obsolete yet I still see cropping up in projects using EF time and time again…

What is a Repository?

The repository pattern – to me – is just a form of data access gateway. We use it to provide both a form of abstraction above the details of data access, as well as to provide testability to the calling clients, e.g. services or perhaps even view models / controllers. A typical repository will have methods such as the following:-

interface IRepository<T>
{
    T GetById(Int32 id);
    T Insert(T item);
    T Update(T item);
    T Delete(T item);
}

interface ICustomerRepository : IRepository<Customer>
{
    Customer GetByName(String name);
}

And so on. You’ll probably create a Repository<T> class which does the basic CRUD work for any <T>. Each one of these repositories will delegate to an EF ObjectContext (or DbContext for newer EF versions), and they’ll offer you absolutely nothing. Allow me to explain…

Getting to EF data in Services

Let’s illustrate the two different approaches with a simple example service method that gets the first customer whose name is an arbitrary string. In terms of objects and responsibilities, the two approaches are somewhat different. Here’s the Repository version: –

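public class Service
{
    private readonly ICustomerRepository customerRepository;
    public Service(ICustomerRepository customerRepository)
    {
        this.customerRepository = customerRepository;
    }
    public Customer GetCustomer(String customerName)
    {
        return customerRepository.GetByName(customerName);
    }
}
public class CustomerRepository : ICustomerRepository
{
    private readonly DatabaseContext context;
    public CustomerRepository(DatabaseContext context)
    {
        this.context = context;
    }
    // GetById, Insert, Update and Delete omitted for brevity
    public Customer GetByName(String customerName)
    {
        return context.Customers.First(c => c.Name == customerName);
    }
}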

Using the Repository pattern, you generally abstract out your actual query so that your service does any “business logic” e.g. validation, and then orchestrates repository calls e.g. Get customer 4, Amend name, Update customer 4. You’ll also invariably end up templating your Repositories (which, if you read my blog regularly, you know I hate) for common logic like First, Where etc. – all these methods will just delegate onto the equivalent method on DbSet.

If you go with the approach of talking to EF directly, you enter your queries directly in your service layer. There’s no abstraction layer between the service and EF.

public class ServiceTwo
{
    private readonly DatabaseContext context;

    public ServiceTwo(DatabaseContext context)
    {
        this.context = context;
    }

    public Customer GetCustomer(String customerName)
    {
        return context.Customers.First(c => c.Name == customerName);
    }
}

So there’s now just one class, the service, which is coupled to DatabaseContext rather than CustomerRepository; we perform the query directly in the service. Notice also that Context contains all our repositories e.g. Customers, Orders etc. as a single dependency rather than one per type. Why would we want to do this? Well, you cut out a layer of indirection, reduce the number of classes you have (i.e. the whole Repository hierarchy vs a fake DbContext + Set), making your code quicker to write as well as easier to reason about.

Aha! Surely now we can’t test out our services because we’re coupled to EF! And aren’t we violating SRP by putting our queries directly into our service? I say “no” to both.

Testability without Repository

How do we fix the first issue, that of testability? There are actually many good examples online for this, but essentially, think about this – what is DbContext? At its most basic, it’s a class which contains multiple properties, each implementing IDbSet<T> (notice – IDbSet, not DbSet). What is IDbSet<T>? It’s much the same thing as our old friend, IRepository<T>. It contains methods to Add, Remove etc., and in addition implements IQueryable<T> – so you get basically the whole LINQ query set including things like First, Single and Where.

Because DbSet<T> implements the interface IDbSet<T>, you can write your own implementation which uses e.g. an in-memory List<T> as a backing store instead. This way your service methods can work against in-memory lists during unit tests (easy to generate test data, easy to write assertions for), whilst going against the real DbContext at runtime. You don’t need to play around with mocking frameworks – in your unit tests you can simply generate fake data and place it into your fake DbSet lists.
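Here is a minimal sketch of the idea. To keep it free of any EF reference I've assumed a hypothetical IDatabaseContext that exposes a plain IQueryable<Customer>; in a real project that property would be an IDbSet<Customer> and the fake set would implement the rest of that interface too. FakeDatabaseContext, Seed and CustomerService are likewise names I've made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Customer
{
    public string Name { get; set; }
}

// Hypothetical stand-in for the EF context; real code would expose IDbSet<Customer>.
public interface IDatabaseContext
{
    IQueryable<Customer> Customers { get; }
}

// Test double: serves queries from an in-memory list instead of a database.
public class FakeDatabaseContext : IDatabaseContext
{
    private readonly List<Customer> customers = new List<Customer>();

    public IQueryable<Customer> Customers
    {
        get { return customers.AsQueryable(); }
    }

    public void Seed(params Customer[] items)
    {
        customers.AddRange(items);
    }
}

public class CustomerService
{
    private readonly IDatabaseContext context;

    public CustomerService(IDatabaseContext context)
    {
        this.context = context;
    }

    // The query lives in the service and runs unchanged against the fake.
    public Customer GetCustomer(string customerName)
    {
        return context.Customers.First(c => c.Name == customerName);
    }
}
```

A unit test then just seeds the fake with a couple of customers, calls GetCustomer and asserts on the result, with no mocking framework in sight.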

I know that some people whinge about this, saying “it doesn’t prove the real SQL that EF will generate; it won’t test performance” etc. That’s true – however, this approach doesn’t try to solve that. What it does try to do is remove the unnecessary IRepository layer and reduce friction, whilst improving testability – for 90% of your EF queries e.g. Where, First, GroupBy etc., this will work just fine.

Violation of SRP

This one is trickier. You ideally want to be able to reuse your queries across service methods – how do we do that if we’re writing our queries inline in the service? The answer is – be pragmatic. If you have a query that is used once and once only, or a few times but is a simple Where clause – don’t bother refactoring for reuse.

If, on the other hand you have a large query that is being used in many places and is difficult to test, consider making a mockable query builder that takes in an IQueryable, composes on top of it and then returns another IQueryable back out. This allows you to create common queries yet still be flexible in their application – whilst still giving you the ability to go directly to your EF context.
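As a rough sketch of such a query builder (ICustomerQueryBuilder, ActiveWithNamePrefix and the IsActive property are all invented for this example), it might look like this:

```csharp
using System;
using System.Linq;

public class Customer
{
    public string Name { get; set; }
    public bool IsActive { get; set; } // hypothetical property for the example
}

// An interface so that services can be handed a stub builder in tests.
public interface ICustomerQueryBuilder
{
    IQueryable<Customer> ActiveWithNamePrefix(IQueryable<Customer> source, string prefix);
}

public class CustomerQueryBuilder : ICustomerQueryBuilder
{
    // Composes on top of whatever IQueryable it is given and hands one back,
    // so the caller can keep composing (OrderBy, Take etc.) afterwards.
    public IQueryable<Customer> ActiveWithNamePrefix(IQueryable<Customer> source, string prefix)
    {
        return source.Where(c => c.IsActive && c.Name.StartsWith(prefix));
    }
}
```

A service could call builder.ActiveWithNamePrefix(context.Customers, "Sm") and then add its own OrderBy or First on the end; nothing is executed until the result is enumerated.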

Conclusion

Testability is important when writing EF-based data-driven services. However, the Repository pattern offers little when you can write your services directly against a testable EF context. You can in fact get much better testability from a service-with-an-EF-context based approach than just with a repository, as you can test out your LINQ queries against a fake context, which at least proves your query represents what you want semantically. It’s still not a 100% tested solution, because your code does not test out the EF IQueryable provider – so it’s important that you still have some form of integration and / or performance tests against your services.

Using Unity Call Handlers to compose logic


The most common use of Unity Call Handlers (or Interceptors) is for cross-cutting concerns. I’ve demonstrated the use of such handlers in the past for things such as logging or caching. However, there’s another use for these handlers that allows us to build reusable blocks of business-related code that can be composed together to act in a number of ways: –

  • Filtering out data that is not appropriate for the target method
  • Enriching data before consumption by the target method
  • Enriching the return data after consumption by the target method

Call Handlers as a pipeline

This is possible because of the way call handlers work – they form a pipeline whereby each handler has the opportunity to prematurely end the pipeline flow at any point, creating a return object, or amend the input argument before passing it on to the next handler.

[Image: diagram of the call handler pipeline between caller and target]

  • The Blue lines indicate the initial flow from caller to target via each call handler.
  • Any call handler may decide to prematurely return to the caller without passing onto the target, indicated by a red line.
  • If the call makes it all the way to the target, it returns back up the stack to each handler, who cascades back up all the way to the source caller, shown in Green.
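The flow described above can be sketched without Unity at all. The following is a hand-rolled pipeline of my own, not Unity's API: the Handler delegate, Pipeline.Build and the sample handlers are all made up to show the three behaviours (ending early, amending the input, enriching the result on the way back).

```csharp
using System;
using System.Collections.Generic;

// A handler gets the input plus a delegate that runs the rest of the pipeline.
public delegate string Handler(string input, Func<string, string> next);

public static class Pipeline
{
    // Chains the handlers in order, with 'target' at the far end.
    public static Func<string, string> Build(IList<Handler> handlers, Func<string, string> target)
    {
        Func<string, string> current = target;
        for (int i = handlers.Count - 1; i >= 0; i--)
        {
            Handler handler = handlers[i];
            Func<string, string> next = current;
            current = input => handler(input, next);
        }
        return current;
    }
}

public static class SampleHandlers
{
    // Ends the pipeline early (the "red line") for empty input.
    public static string RejectEmpty(string input, Func<string, string> next)
    {
        return string.IsNullOrEmpty(input) ? "rejected" : next(input);
    }

    // Amends the input before passing it on.
    public static string Trim(string input, Func<string, string> next)
    {
        return next(input.Trim());
    }

    // Enriches the return value on the way back up (the "green line").
    public static string Bracket(string input, Func<string, string> next)
    {
        return "[" + next(input) + "]";
    }
}
```

Building a pipeline from RejectEmpty, Trim and Bracket around a target that upper-cases its input turns " abc " into "[ABC]", whilst an empty string never reaches the target at all.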

Filtering data with Call Handlers

Let’s start with a relatively simple example: a generic file parsing system which processes XML files dropped into folders. Each folder contains a different structure of XML file, and we have an appropriate parser class for each one. Perhaps we have ten different parsers. Now, let’s imagine that we wanted to filter out (i.e. not process) some files, for certain parsers – but not all of them – given some of the following conditions: –

  • A file is too old – say, over a week old
  • A file is too big – say, over 10 MB
  • A file contains an attribute named “Ignore” on the root XML element

Now, if we were writing these parsers (with unit tests, of course), let’s pretend our IParser interface looked something like this: –

[Image: code listing of the IParser interface]

Imagine that each XDocument has a header that contains things like date published and size etc. etc. Also imagine that when each parser implemented ParseDocument it would first perform any tests required to ensure that certain filter conditions had not been failed. Remember, some parsers will not need to do any filtering. Some might need all three filters. Others might only need one or two.

Fragility of unit tests

Even if we abstracted the logic of these filters into helper methods – or even put an interface on top of them so we could stub them out – your unit tests for each parser would still grow with each extra filter that you add. For example, if you had a parser with five filter conditions, you would have to mock out the first four in order to prove that the fifth was correctly checked. Even worse, if you decided to re-order your filter checks (let’s say that you realise that the first one is slow to run so push it to the back), your unit tests would break.

[Image: code listing of a parser performing its filter checks inline]

In the example above, to test that the second filter (IsDocumentTooLarge()) is being called, we have to mock the result of IsDocumentMarkedAsIgnore() first. If we swapped the order of the calls, our unit tests would break.
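To make the problem concrete, here is a sketch of what such a parser might look like. Only ParseDocument, XDocument and the two filter method names come from the discussion above; the void return type, IDocumentFilters, OrderParser, ParsedCount and the hand-written StubFilters are assumptions of mine.

```csharp
using System;
using System.Xml.Linq;

public interface IParser
{
    void ParseDocument(XDocument document);
}

// The filter checks, pulled behind an interface so they can be stubbed.
public interface IDocumentFilters
{
    bool IsDocumentMarkedAsIgnore(XDocument document);
    bool IsDocumentTooLarge(XDocument document);
}

public class OrderParser : IParser
{
    private readonly IDocumentFilters filters;

    public OrderParser(IDocumentFilters filters)
    {
        this.filters = filters;
    }

    public int ParsedCount { get; private set; }

    public void ParseDocument(XDocument document)
    {
        // The order of these checks leaks into every test of the parser.
        if (filters.IsDocumentMarkedAsIgnore(document)) return;
        if (filters.IsDocumentTooLarge(document)) return;

        ParsedCount++; // stands in for the real parsing work
    }
}

// Hand-written stub used by the tests.
public class StubFilters : IDocumentFilters
{
    public bool MarkedAsIgnore { get; set; }
    public bool TooLarge { get; set; }
    public bool TooLargeWasChecked { get; private set; }

    public bool IsDocumentMarkedAsIgnore(XDocument document)
    {
        return MarkedAsIgnore;
    }

    public bool IsDocumentTooLarge(XDocument document)
    {
        TooLargeWasChecked = true;
        return TooLarge;
    }
}
```

To prove that the size check runs, a test has to set MarkedAsIgnore to false first, purely because of where that check happens to sit in the method.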

Using Call Handlers to act as filters

What I really want to see is code like this: –

[Image: code listing of a parser method decorated with filter attributes]

Each of these attributes should map to a handler which performs a single check, and either passes on to the next item in the pipeline, or returns prematurely. Our unit tests on the parser would simply be ones that verify we have the attribute on the correct method, as well as tests for the actual parsing. That’s it.
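As a sketch of how small those parser tests become, the following uses plain marker attributes and reflection. The attribute names, InvoiceParser and ParserChecks are hypothetical, and I've derived from System.Attribute to keep the example self-contained; with Unity you would derive from its HandlerAttribute instead so that each attribute can create its call handler.

```csharp
using System;
using System.Reflection;
using System.Xml.Linq;

// Marker attributes; each one would map to a single filtering call handler.
[AttributeUsage(AttributeTargets.Method)]
public sealed class IgnoreOldFilesAttribute : Attribute { }

[AttributeUsage(AttributeTargets.Method)]
public sealed class IgnoreLargeFilesAttribute : Attribute { }

public class InvoiceParser
{
    [IgnoreOldFiles]
    [IgnoreLargeFiles]
    public virtual void ParseDocument(XDocument document)
    {
        // Only parsing logic lives here; no filter checks at all.
    }
}

public static class ParserChecks
{
    // True if the named public method carries the given attribute.
    public static bool HasFilter<TAttribute>(Type parserType, string methodName)
        where TAttribute : Attribute
    {
        MethodInfo method = parserType.GetMethod(methodName);
        return method != null && method.IsDefined(typeof(TAttribute), false);
    }
}
```

The parser's filter tests reduce to one-line checks such as ParserChecks.HasFilter<IgnoreOldFilesAttribute>(typeof(InvoiceParser), "ParseDocument"), and re-ordering the attributes breaks nothing.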

Even better, as each call handler lives in its own class and is completely decoupled from any parser, we can apply them to other parsers quickly and easily.

In my next post, I’ll demonstrate a simple call handler to perform one of these filters, and talk a bit more about the other two uses of CallHandlers that I mentioned at the start of this post.

Why I Hate the Template Pattern – Part 2


In my last post, I discussed in detail why exactly I don’t recommend the use of the Template pattern for writing testable code. Here I want to illustrate an alternative to it, but before I do that, I want to talk about a more fundamental aspect of OO design.

Inheritance vs Composition

The way I see it, inheritance is an expensive and heavyweight mechanism in many languages, particularly those that only offer a single inheritance model, such as C# and VB .NET. It’s not quite the same in others like C++ (although multiple inheritance brings with it its own problems), but in .NET languages, you have to be really careful about creating inheritance chains. You can get yourself into all sorts of ugly situations where your inheritance model is too fat, with abstract methods that are unnecessary etc. etc., or too inflexible to allow you to do something different.

So in these situations, you should seriously consider composing behaviours out of smaller objects. You normally have to do some delegation to those small objects, but the flexibility you gain compared to inheritance far outweighs the cost of that delegation.

In this arbitrary example, we have two types of people – a guitarist and a developer. They can perform operations that are completely unrelated. Using an inheritance model we’d struggle to compose these together i.e. Does Developer inherit from Guitarist? The other way around? What if we only wanted the functionality from the derived class and not the base class?

With composition we approach the problem slightly differently: –

[Image: the two behaviour interfaces and their concrete classes]

We create two interfaces, one for each behaviour, and then create one concrete class for each interface. Neither has any relationship with the other, but we are now in a position to simply use them as we see fit. In this example we make a composite DeveloperWhoPlaysGuitar type that inherits from neither class but implements both interfaces. It instead stores a private instance of both “real” objects and delegates the calls to them as appropriate: –

[Image: the composite DeveloperWhoPlaysGuitar type]

This gives the illusion that we have a type that inherits from both Guitarist and Developer, and is a much more flexible model than strict inheritance, albeit now we have had to create a third class (the composed type above) rather than just two types with inheritance.
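In code, the composition described above might be sketched as follows. The method names PlayGuitar and WriteCode are my own invention, and I've chosen to pass the two "real" objects in through the constructor rather than new them up inside the composite.

```csharp
using System;

public interface IGuitarist
{
    string PlayGuitar();
}

public interface IDeveloper
{
    string WriteCode();
}

public class Guitarist : IGuitarist
{
    public string PlayGuitar() { return "Strumming a chord"; }
}

public class Developer : IDeveloper
{
    public string WriteCode() { return "Writing some code"; }
}

// Inherits from neither class; implements both interfaces by delegation.
public class DeveloperWhoPlaysGuitar : IGuitarist, IDeveloper
{
    private readonly IGuitarist guitarist;
    private readonly IDeveloper developer;

    public DeveloperWhoPlaysGuitar(IGuitarist guitarist, IDeveloper developer)
    {
        this.guitarist = guitarist;
        this.developer = developer;
    }

    public string PlayGuitar() { return guitarist.PlayGuitar(); }

    public string WriteCode() { return developer.WriteCode(); }
}
```

Taking the interfaces in the constructor also means that either behaviour can be swapped for a fake in a test, which plain inheritance would not let you do.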

Applying Composition to the Template problem

We can apply this approach to solving the Template problem, which leads to another design pattern completely – the Strategy pattern. In our original example with File Readers, we would probably redesign the API as follows: –

[Image: the redesigned file reader API]

We now have a single class called FileProcessor, which was originally the template base class. However, instead of having abstract methods on the class, we have shuffled them off to a new interface called IFileReader. All the concrete readers implement this interface. The FileProcessor now takes in a single instance of this interface when we call ReadFiles(): –

[Image: FileProcessor.ReadFiles() taking an IFileReader]

This way, the FileProcessor is now far less tightly coupled to the actual reading of files. Its job is now simply to perform business logic checks around the reading of files, and then delegate the actual grunt work of reading the file and checking validity of the file to the reader itself.
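A minimal sketch of that shape is below. FileProcessor, IFileReader and ReadFiles are the names used above; the IsValid and Read members, the paths parameter, the blank-path rule and the FakeFileReader are assumptions I've made to keep the example small.

```csharp
using System;
using System.Collections.Generic;

// What used to be the abstract methods on the template base class.
public interface IFileReader
{
    bool IsValid(string path);
    string Read(string path);
}

public class FileProcessor
{
    // The strategy (reader) is passed in rather than inherited.
    public IList<string> ReadFiles(IFileReader reader, IEnumerable<string> paths)
    {
        var results = new List<string>();
        foreach (var path in paths)
        {
            if (string.IsNullOrWhiteSpace(path)) continue; // stand-in business rule
            if (!reader.IsValid(path)) continue;           // validity is the reader's call
            results.Add(reader.Read(path));                // so is the grunt work
        }
        return results;
    }
}

// A fake reader that touches no real files, for testing the processor.
public class FakeFileReader : IFileReader
{
    public bool IsValid(string path)
    {
        return path.EndsWith(".csv");
    }

    public string Read(string path)
    {
        return "contents of " + path;
    }
}
```

Because the processor only knows about the interface, a test can hand it the fake and assert on the returned list without any file system involved.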

Testing out a Strategy-based API

We could easily test out our new FileProcessor class since it is no longer tightly bound to any implementation of IFileReader; we could just inject a fake in. Similarly, we can now test our new CsvFileReader much more easily than before since it now exposes a public API that we can call: –

[Image: unit test for CsvFileReader]

No mocks involved. No ridiculously large arrangement of code. No awkward naming of tests. Improved readability. Simples!

Conclusion

Whilst it is quick and easy to design and create a class hierarchy using the Template pattern, the result is exceptionally difficult to test, even for relatively simple template methods. What you end up doing is essentially an integration test between the template base class and the classes that implement the abstract methods.

The Strategy pattern involves the overhead of creating an extra type (the interface containing what would have been the abstract methods) and passing it in to the driver class. However, it proves to be far easier to test each implementation of your interface in isolation, as well as to test the runner / processor class out because we no longer have a tightly bound relationship between the two.

In addition, we gained some flexibility because we can now reuse the IFileReaders across other classes that may want that functionality, and not just the FileProcessor.