What on earth has happened to NuGet?


After several months away from NuGet, I had to use it again recently in VS2015. I’m completely and utterly gobsmacked at how poor the current experience is. It’s confusing, inconsistent and hard to use. Worse than that, it enables workflows that should never, ever be permitted within a package management system.

First experiences with the NuGet dialog

When I first saw the previews of VS2015, I thought that the non-modal dialog integrated into VS would be a good thing. Unfortunately, it suffers from not making the workflow clear enough. You're bombarded with dropdowns that sometimes have only one option in them, search boxes in non-intuitive positions, installation options that you don't want to see, and so on.

Let's start by adding a package to a solution. This is a common task, so it should be as easy as possible. Here I'm adding my old friend Unity Automapper to two projects in a solution (note that I've deliberately downgraded to an earlier version of the Automapper): –

Why so many options? Why a dropdown for Action that only has one item in it? What does the "Show All" checkbox do – looks like nothing to me at this point. Are there projects hidden from the list? Why?

Working with packages

Next let's try to do something to the package, e.g. uninstall or update it. You go to Manage Packages for Solution. You don't see the packages that you've already installed – instead, you see all available packages, ordered by (I assume) most popular download. My installed package is nowhere to be seen. How do I find installed packages only? Oh. You have to click "Filter" and then select "Installed". That's not what I thought of doing when I selected the menu option. I thought it would just show it to me – it took me a good few seconds to realise what had happened!

Also notice that the default dependency behaviour is "lowest" – I don't like this. Highest should be the default, or maybe Highest Minor. Nine times out of ten you'll just have to upgrade them all immediately afterwards anyway. Which brings me nicely onto the subject of upgrading NuGet packages…

Upgrading dependencies

So now I’ve found my package, I want to upgrade it. So I choose to manage packages for solution, and change the action from Uninstall to Update. It picks the latest version available (good).

But then notice in the screenshot above that it allows me to uncheck some of the projects within the solution. What?? Why would you want to explicitly upgrade only some of the projects within the solution? I can maybe think of some corner cases, but generally this is definitely a no-no – a recipe for absolute disaster. At best you'll muddle on with some binding redirects; at worst you'll get a runtime error at some indeterminate time in the future when you try to call a method that doesn't exist. What happens if you have a breaking change between the two versions? Which version will get deployed? Can you be sure? Don't do this. Ever. I can't even understand why NuGet offers the user the ability to do this.

Even worse, if you choose to update a NuGet dependency at the project level (by e.g. right clicking the references node in VS), VS will upgrade it on that project alone. The other projects that have that dependency will be left untouched. You’ll end up with different versions of a dependency in a solution without you even realising it.

But let's see what happens if we decide to live dangerously and go ahead anyway. Firstly, NuGet will happily upgrade half our solution and leave the other half in an old state – no warnings of impending doom, nothing. Your code might work, it might not. Maybe it'll work fine for some weeks, and then you'll realise that the upgrade has a breaking change – but only when project A tries to call a method that no longer exists.

The next time you enter the NuGet panel to work with your dependencies – e.g. to uninstall the dependency – you'll see the following: –

Why is only one of my two projects showing in the list of projects that I want to uninstall? Because it's filtering on that specific version of the dependency (1.1.0). I have to choose the version that I want to uninstall. I don't want that – I just want to uninstall the entire dependency. But no, we have to do it version by version. If we now change the Action to "Update", the Version dropdown suddenly has a different meaning. It no longer acts as a filter – it now states what version we'll upgrade to. Instead, you now filter based on the projects checked in the list below. So the meaning of each user control changes based on the action you're performing. This is not good UI design.

UI Thoughts

In fact, the UI as a whole is really, really confusing. There are other issues too – it's slow. You can't upgrade all projects through the UI yet, so you either have to fall back to the Package Manager Console, or upgrade each package one by one. Of course, after you do any single update, the UI reverts to showing "All" packages, which involves loading the top results from NuGet. This takes a few seconds. Then you need to change the dropdown to show "Installed" packages, and start all over again.

Conclusion

Having seen the new NuGet, I'm really worried. It's been in VS2015 for a while – and developers are putting up with this? It's slow. It's not a well-designed UI – even I can tell that. You don't feel like you know what is actually going to get deployed. The workflows don't really work. It's not good.

And it's not just the UI that concerns me with NuGet. It's also the fundamental structure underneath it, which is unchanged and needs to be fixed. NuGet should manage dependencies across an entire solution as the default, ensuring that dependencies across projects are stable. Instead, we have individual packages.config files in each project, each with its own version of each dependency. This is not what you want! As illustrated above, it's very possible – and in fact quite likely on a larger project – that you'll quickly end up with different versions of a dependency across projects. I've seen it lots of times. You run the risk of runtime bugs or crashes – worse still, ones that might only surface when you go down a specific code path in your app, probably the day after it goes live.

I was hoping that NuGet in VS2015 would start to address these issues, but unfortunately it seems like a large step backwards at the moment.

MBrace, CloudFlows and FSharp.Data – data analysis made easy


In case you’ve not seen it before, MBrace is a simple programming model for scalable cloud data scripting and programming with .NET. It’s written in F#, but has growing support for C# and VB .NET. Over the past year or so, I worked closely with the MBrace team to help get it working smoothly on Microsoft Azure, using features such as Service Bus and Storage to provide an excellent development and deployment experience. As MBrace gears up for a v1 release, the design of the API is looking extremely positive.

I'm going to demonstrate a simple example here that illustrates how easy it is to start working with a large CSV file available on the internet in an MBrace cluster, parsing and querying the data – we're going to analyse UK house prices over the past year (the file is freely available on the gov.uk website).

I’m going to assume that you have an MBrace cluster up and running – if you don’t, you can either use a local development cluster or download the latest source code and deploy a full cluster onto Azure using the example MBrace Worker Role supplied in the MBrace Azure source code.

Type Providers on MBrace

We’ll start by generating a schema for our data using FSharp.Data and its CSV Type Provider. Usually the type provider can infer all data types and columns but in this case the file does not include headers, so we’ll supply them ourselves. I’m also using a local version of the CSV file which contains a subset of the data (the live dataset even for a single month is > 10MB): –
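(A hedged reconstruction of the snippet, rather than the post's original code: the header names are illustrative of the gov.uk price paid dataset, and "housePrices.csv" stands in for the local sample file.)

```fsharp
// Assumes the FSharp.Data package is referenced; the schema below is a sketch.
open FSharp.Data

type HousePrices = CsvProvider< "housePrices.csv", HasHeaders = false, Schema = "TransactionId, Price (int), DateOfTransfer (date), Postcode, PropertyType, NewBuild, Duration, PAON, SAON, Street, Locality, Town, District, County, CategoryType" >
```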

In that single line, we now have a strongly-typed way to parse CSV data. Now, let’s move onto the MBrace side of things. I want to start with something simple – let’s get the average sale price of a property, by month, and chart it.

A CloudFlow is an MBrace primitive which allows a distributed set of transformations to be chained together, just as you would with the Seq module in F# (or LINQ's IEnumerable operators for the rest of the .NET world). The difference is that in MBrace, a CloudFlow pipeline is partitioned across the cluster, making full use of the resources available; only when the pipelines are completed in each partition are the results aggregated together again.
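Here's a hedged sketch of the sort of pipeline this gives us – csvUri stands in for the gov.uk download URL, cluster for a handle to a running MBrace cluster, and the MBrace.Flow combinators are as per the 1.0-era API: –

```fsharp
open MBrace.Flow

let averagePricesByMonth =
    CloudFlow.OfHttpFileByLine csvUri                   // stream the file line-by-line across the cluster
    |> CloudFlow.collect HousePrices.ParseRows          // strongly-typed rows via the CSV type provider
    |> CloudFlow.groupBy (fun row -> row.DateOfTransfer.Month)
    |> CloudFlow.map (fun (month, rows) -> month, rows |> Seq.averageBy (fun row -> float row.Price))
    |> CloudFlow.toArray
    |> cluster.Run                                      // an (int * float) [] of month * average price
```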

Also notice that we’re using type providers in tandem with the distributed computation. Once we call the ParseRows function, in the next call in the pipeline, we’re working with a strongly-typed object model – so DateOfTransfer is a proper DateTime etc. All dependent assemblies have automatically been shipped with MBrace; it wasn’t explicitly designed to work with FSharp.Data – it just works. So now that we have an array of int * float i.e. month * price, we can easily map it on a chart: –
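(A hedged sketch of the charting step, assuming FSharp.Charting is referenced in the script.)

```fsharp
open FSharp.Charting

averagePricesByMonth
|> Array.sortBy fst
|> Chart.Line
```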

Easy.

Persisted Cloud Flows

Even better, MBrace supports something called Persisted Cloud Flows (known in the Spark world as RDDs). These are flows whose results are partitioned and cached across the cluster, ready to be re-used again and again. This is particularly useful if you have an intermediary result set that you wish to query multiple times. In our case, we might persist the first few lines of the computation (which involves downloading the data from source and parsing with the CSV Type Provider), ready to be used for any number of strongly-typed queries we might have: –
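A hedged sketch follows – CloudFlow.cache is the combinator from the 1.0-era API (later builds expose this as CloudFlow.persist), and the PropertyType filter is illustrative: –

```fsharp
// Download and parse once, then persist the partitioned results in cluster memory.
let persistedPrices =
    CloudFlow.OfHttpFileByLine csvUri
    |> CloudFlow.collect HousePrices.ParseRows
    |> CloudFlow.cache
    |> cluster.Run

// Subsequent strongly-typed queries re-use the persisted flow.
let terracedSales =
    persistedPrices
    |> CloudFlow.filter (fun row -> row.PropertyType = "T")
    |> CloudFlow.length
    |> cluster.Run
```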

So notice that the first query takes 45 seconds to execute, which involves downloading the data and parsing it via the CSV type provider. Once we’ve done that, we persist it across the cluster in memory – then we can re-use that persisted flow in all subsequent queries, each of which just takes a few seconds to run.

Conclusion

MBrace is on the cusp of a 1.0 release – it's ready for you to start using now, and offers a powerful and flexible set of abstractions for distributed computations. And as you can see from the above, if you've used the collection libraries in F# before, making the leap to distributed collection queries is a very smooth transition. In less than ten lines of code, you can start writing distributed queries against live datasets with the minimum of effort.

Stateless services on Azure Service Fabric in F#


In my previous posts, I discussed the use of the Service Fabric (SF) actor framework (which is loosely based on Orleans) and F#, and how we can use FP features within an actor model, even one designed for OO languages.

Exposing Services with Service Fabric

Ironically, the actor framework within SF is one of its more complex features – you can use SF to host literally any .NET code you want. There are a number of features within SF designed to allow you to rapidly host scalable systems, with support for state replication out of the box. In this post, I want to illustrate the steps needed to host the F#, FP-first web server Suave in Service Fabric. It turns out that there's really not much code needed at all.

  1. We create an F# executable that is compatible with Service Fabric.
  2. We create a service that inherits from the StatelessService class (we’ll discuss Stateful Services in another post).
  3. We override the CreateCommunicationListener method. This is important – essentially this method’s responsibility is to create an object that can handle incoming traffic from external sources. We also make a note of the port that Suave will be running on.
  4. We configure an endpoint in the Service Fabric configuration for that same port. This tells SF to allow inbound traffic. This is roughly analogous to opening up an endpoint in Cloud Services. It's also something you should have specified when creating the cluster itself in Azure (if not, you'll need to manually configure the load balancer to allow traffic through).
  5. In our Main program, we register the service with SF.

The key part is (3), where we implement the functionality that should get called to handle incoming requests. It’s pretty basic really: –
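(What follows is a hedged reconstruction rather than the post's exact code: the Service Fabric names – StatelessService, ICommunicationListener and its members – are from the preview-era SDK and may differ in later versions; the Suave calls are real.)

```fsharp
open System.Threading.Tasks
open Microsoft.ServiceFabric.Services
open Suave
open Suave.Successful

type SuaveService() =
    inherit StatelessService()

    override __.CreateCommunicationListener() =
        // an F# object expression – no need to declare a formal type
        { new ICommunicationListener with
            member __.Initialize initParams =
                () // in the full sample, read the endpoint port from config here
            member __.OpenAsync cancellationToken =
                // start Suave in the background, then publish our address to clients
                async { startWebServer defaultConfig (OK "Hello from Suave!") } |> Async.Start
                Task.FromResult "http://localhost:8083"
            member __.CloseAsync cancellationToken = Task.FromResult () :> Task
            member __.Abort() = () }
```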

So CreateCommunicationListener() expects an instance of ICommunicationListener that will create the web server for us. Luckily, with F#'s object expressions we don't even have to declare a formal type – we can simply create the object on the fly. As you can see, all it does is start up Suave using default settings. You might elect to supply the port that it starts on from the endpoint configuration in Service Fabric – this is done in the Initialize method, and is included in the full sample.

Once done, you can configure the scalability of the service in config – if you want three instances, just set the instance attribute to 3 in the ApplicationManifest file of your hosting Service Fabric application. If you want it on every node, you set the attribute to -1 (because we all know that -1 is the universal standard for “absence of a number” – we don’t need no option types ;-). Note that running this locally with multiple instances won’t work, since they all try to run on the same port, but in the real world it’d work fine I’m sure.

As an aside, if you want any arbitrary service that doesn’t necessarily need incoming traffic e.g. something subscribing to service bus or writing to a DB, you don’t have to implement anything regarding ICommunicationListener. There’s simply a RunAsync() method that you can put any code inside that you want.

So, there you have Suave in Service Fabric with a minimal amount of code. For this you’ll get an auto-load balancing, scalable and automatically healing service. In my next post, I’ll demonstrate StatefulServices and how we can use them to automatically manage state across a cluster of services.

Building Azure Service Fabric Actors with F# – Part 2


In Part 1, I provided an overview of what Service Fabric (SF) is, and provided some step-by-step guidance on how to get up and running with the Service Fabric local installation. In this post, I want to move from the infrastructure to the code, and show how we can use F# with an Actor model designed primarily for C# and VB .NET, whilst still retaining an idiomatic F# feel where possible.

All code for the full sample used as the basis for this series is available here.

Actors in Service Fabric

Firstly, I’ll show you some elided examples of how we modelled some features of my cat as an Actor in Service Fabric. Every cat has some state which is affected by actions it does, and needs to be persisted across calls. In Service Fabric, we call this a “stateful” actor. After every “state-updating” action (in SF terms, this equates to a method call on the actor), SF will automatically persist your state back to disk and automatically replicate to other nodes in the SF cluster (typically at least two others); if your primary node goes down, one of the secondaries will immediately take over and the failed node will be silently replaced in the background. You can also have so-called “read only” actions, which do not modify state but typically return some payload to the caller. You can typically think of these as “getter” methods / properties on a class. You’ll normally have a mix of both state-mutating and read-only methods on a given actor.

Implementing Stateful Actors in F#

Every stateful Actor in SF inherits from the type Actor<T>, where T is the state that needs to be persisted. It shows up as a member property on the actor, State. Service Fabric will automatically create one of these when starting every given actor, and silently persist / load it across calls etc.

We'll start by modelling the state on the Actor with a standard OO class in F# – see below. Notice the DataContract and DataMember attributes – these are used by the persistence layer of SF to de/re-hydrate state to an Actor. Personally I'm not particularly fond of these attributes – there are plenty of serialization frameworks out there that seem to work just fine without decorating every single property, so why are we stuck with this old-school approach? Perhaps there's a way to replace the serialization in SF – I haven't tried yet.
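Something like this (a hedged reconstruction – the property names are illustrative):

```fsharp
open System.Runtime.Serialization

[<DataContract>]
type CatState() =
    [<DataMember>] member val Hunger = 0 with get, set
    [<DataMember>] member val Happiness = 0 with get, set
    [<DataMember>] member val OwnerHappiness = 0 with get, set
```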

Anyway, here’s an example method on Cat, called Jump(). It takes in a destination of where the cat is jumping to, and depending on the destination, this affects the cat – and the owner’s – Happiness (in a more fully featured model, the owner themselves would probably be an actor with their own state). The cat will also work up an appetite by Jumping(). Hunger can be alleviated by Feeding() the cat.
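Here's a hedged sketch of that mutate-in-place style – the Destination cases and the numbers are illustrative, and ICat is the actor's interface: –

```fsharp
open System.Threading.Tasks
open Microsoft.ServiceFabric.Actors

type Destination = Floor | Table | OwnersLap

type ICat =
    inherit IActor
    abstract Jump : destination:Destination -> Task
    abstract Feed : calories:int -> Task

type Cat() =
    inherit Actor<CatState>()
    interface ICat with
        member this.Jump destination =
            // state is mutated in several places, arbitrarily
            this.State.Hunger <- this.State.Hunger + 5
            match destination with
            | Floor -> this.State.Happiness <- this.State.Happiness + 1
            | Table -> this.State.OwnerHappiness <- this.State.OwnerHappiness - 5
            | OwnersLap ->
                this.State.Happiness <- this.State.Happiness + 10
                this.State.OwnerHappiness <- this.State.OwnerHappiness + 2
            Task.FromResult () :> Task
        member this.Feed calories =
            this.State.Hunger <- this.State.Hunger - calories
            Task.FromResult () :> Task
```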

On the one hand, F# works nicely with interfaces – we still don’t have to specify types, as they are inferred from the interface we’re implementing. However, this sample is still somewhat unsatisfactory to me as an F#-first person: I’m used to creating copies of data from other data, not mutating it. I also don’t like this approach of modifying state in several places arbitrarily – I feel uneasy when seeing code like this. It seems very statement oriented, with side effects everywhere – something I struggle to reason about easily. There must be something better!

Use immutable data structures on Actors

As it turns out, there is. Notice that up until now we've basically written everything in an OO style, using standard C# / VB constructs like classes etc. – we've not used any F# types. We can actually use many F# features without too much fuss, and they can quickly help us out in our quest to get back to sane and easy-to-reason-about code.

Firstly, we can change the way we model our state, from a class to an F# record. This actually works without any problem once you do the same WCF-style attribute decoration and add the [<CLIMutable>] attribute – this is necessary because, although Records boil down to standard Classes, by default there's no public setter on any of the properties, so SF can't rehydrate state. We can also add in other F#-only features, like units of measure, if we want – as these are a compile-only feature, there's no issue with serializing them.
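The record version, as a hedged sketch (field names as before; the measure is just for illustration):

```fsharp
[<Measure>] type happiness

[<DataContract; CLIMutable>]
type CatState =
    { [<DataMember>] Hunger : int
      [<DataMember>] Happiness : int<happiness>
      [<DataMember>] OwnerHappiness : int<happiness> }
```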

On their own, using records within SF only works up to a point – we’re forced to make copies of state, rather than mutating the single attributes of the State member multiple times, which is a good thing. However, it still looks undesirable – we’re now just mutating the State member property on the Actor instead! Plus it’s not clear when and where we should replace the contents of the State member within the method – every time? Once at the end of the method call? Something in between?

Adapting functional patterns into Actors

Let's take a step back and think about the two types of methods I mentioned earlier on – state-updating and read-only calls. The former intends to do some processing and update the State of the actor. The latter typically reads from the State and returns some data to the caller (I'm setting aside things like calling external dependencies, which for simplicity's sake we can ignore – plus it really doesn't affect us here, as we would partially apply our functions with dependencies). We can formally specify such actions and implement them with something like this: –
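(A hedged formalisation – the signatures are mine, sketched from the description above, and the numbers remain illustrative.)

```fsharp
type StateUpdater<'State, 'Msg> = 'Msg -> 'State -> 'State
type ReadOnlyQuery<'State, 'View> = 'State -> 'View

module CatLogic =
    let jump destination state =
        let state = { state with Hunger = state.Hunger + 5 }
        match destination with
        | Floor -> { state with Happiness = state.Happiness + 1<happiness> }
        | Table -> { state with OwnerHappiness = state.OwnerHappiness - 5<happiness> }
        | OwnersLap ->
            { state with
                Happiness = state.Happiness + 10<happiness>
                OwnerHappiness = state.OwnerHappiness + 2<happiness> }

    let feed calories state = { state with Hunger = state.Hunger - calories }

    let hungriness : ReadOnlyQuery<CatState, int> = fun state -> state.Hunger
```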

Notice how our functions are now much simpler – Jump is made up of a single expression that generates the new State of the Actor, based on the input state and destination – we're no longer mutating state multiple times, or even once. And because State is an immutable record, it's impossible to modify the supplied input State, ever.

Plugging pure functions into Actors

Now that we’ve formalised how we see our actor methods working, we can re-write our earlier code from the anything-goes, mutate-everywhere C# style to one that is easier to test, easier to reason about and more idiomatic from an FP, F# point of view. You’ll notice that the implementation code above is back in a module – so how do we plug this into our OO Actor model?

There are a few ways, but the easiest is with the help of a couple of shim functions that tightly control the mutation of the Actor State whilst delegating to our purely functional code for the business logic. Our core code is kept free from worrying about the mutation of state, which is performed in a consistent manner; our SF Actor model simply delegates to it.
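A hedged sketch, assuming (as in the preview SDK) that State is a settable property on Actor<'T>:

```fsharp
// All mutation now happens in these two shims, and nowhere else.
let update (actor : Actor<_>) updater msg =
    actor.State <- actor.State |> updater msg
    Task.FromResult () :> Task

let read (actor : Actor<_>) view =
    Task.FromResult (view actor.State)

// The actor methods become one-liners that delegate to the pure functions.
type Cat() =
    inherit Actor<CatState>()
    interface ICat with
        member this.Jump destination = update this CatLogic.jump destination
        member this.Feed calories = update this CatLogic.feed calories
```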

A word on Read-Only Service Fabric methods

Another point worth mentioning is Read Only methods in Service Fabric. These are methods where you, as the developer, tell the SF runtime "I will never amend state in this method – don't try to persist state at the end of the call". This is achieved in SF simply by placing the [<Readonly>] attribute on the method. I don't like this much, for two reasons. Firstly, the attribute differs from the System.ComponentModel [<ReadOnly>] attribute simply by virtue of the casing of one character in the type name. Use the wrong one accidentally and things will quickly go pop with your actor (believe me – I did it during the creation of the code referenced in this post; the error that you get isn't helpful either). The other, more dangerous issue is that there is no compile-time safety around the use of the [<Readonly>] attribute. If you decide to start changing state in one of these calls – tough. You won't get any support from the compiler, nor from the runtime. Your method simply won't update state and you'll be left wondering why your application isn't behaving correctly.

With the “adapt to a functional style” approach, whilst we don’t eliminate the issue completely – you still have to decorate the methods appropriately – we at least get compile-time checking on read-only functions, because they don’t allow us to return state; you therefore can’t accidentally modify the state of an actor. In addition, because we’re now using records, which are themselves immutable, it’s impossible for us to modify the state that was supplied to us.

For a simple example like the one supplied, one could argue that the extra delegation and modules etc. complicate matters compared to e.g. C# / OO. However, once you start writing even mildly complicated business logic, it quickly becomes a tiny cost compared to the simplification you gain through immutability, records etc., as well as the usual other benefits of F#.

Taking it further

You can take this approach even further – in other actor frameworks, rather than adopting the "method-per-action" approach, a more functional approach is to have a single message which is itself a discriminated union containing all the different messages; we then pattern match on this in order to process each message appropriately. We can apply this sort of pattern for state-updating messages, although it isn't exactly idiomatic SF actor code (I've supplied an example in the source code).
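A hedged sketch of what that looks like (the post's source contains a fuller example; this is not that exact code):

```fsharp
type CatMessage =
    | Jump of Destination
    | Feed of calories:int

// one pure handler, pattern matching on the message
let handle message state =
    match message with
    | Jump destination -> state |> CatLogic.jump destination
    | Feed calories -> state |> CatLogic.feed calories
```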

Another alternative might be to create a custom Computation Expression (perhaps similar to the Writer monad that Tomas Petricek blogged about many moons ago) in order to make this modification to state even more succinct. Perhaps someone could write one 😉

Conclusion

We've seen how we can marry up some features inherent to the F# type system in order to enforce a cleaner way of reasoning about the code that our actors have to implement, through a couple of simple function signatures and some simple adaptors. We've also seen how F#, and typical FP paradigms, can be used in a reliable and distributable framework designed for a mutable-first OO consumer.

In part three, I want to illustrate how we can quickly and easily host arbitrary services on top of Service Fabric in F# for just about any code you might want to write, and how we can easily scale it to large volume.

Building Azure Service Fabric Actors with F# – Part 1


This post is the first part of a brief overview of Service Fabric and how we can model Service Fabric Actors in F#. Part 1 will cover the details of how to get up and running in SF, whilst Part 2 will look at the challenges and solutions to modelling stateful actors in an OO-based framework within F#.

What is Service Fabric?

Service Fabric is a new service on Azure (currently in preview at the time of writing) which is designed to support reliable, scalable (at "hyper scale") and maintainable distributed applications and services – with automatic support for things like replication of state across nodes, automatic failover & recovery, and multi-tenanting of services on the same instances. It currently supports both stateful and stateless micro-services and actor-model architectures (more on this shortly). The good thing about Service Fabric (SF) from a risk/reward point of view is that it's not a new technology – it actually underpins a lot of existing Azure services, such as Azure SQL, DocDB and even Cortana, so when Microsoft says it's a reliable and scalable technology, they've been using it for a while now with a lot of big services on Azure. The other nice thing is that whilst it's still in private preview for running in Azure, you can get access to a locally running SF here. This isn't an emulator like with Azure Storage – it's apparently the "full" SF, just running locally. Nice.

Actors on Service Fabric

As mentioned, SF supports an Actor model in both stateful and stateless modes. It’s actually based on the Orleans codebase, although I was pleasantly surprised to see that there’s actually no C# code-generation whatsoever in SF – the only bit that’s auto-generated are some XML configuration files which I suspect will be pretty much boilerplate for most people and rarely change.

Why would you want to try SF out? Well, simply put, it allows you to focus on the code you write, as opposed to the infrastructure side of things. You spin up an SF cluster (or run the local version), deploy your code to it, and off you go. This is right up my alley, as someone who likes to focus on creating solutions and sometimes has little patience for messing around with infrastructural challenges or difficulties that prevent me from doing what I’m best at.

Getting up and running with Service Fabric

I’ve been using Service Fabric for a little while now, and spent a couple of hours getting it up and running in F#. As it turns out, it’s not too much hassle to do aside from a few oddities, which I’ll outline here: –

  • Download and install VS2015. Community edition should be fine here. You'll also need Windows 8 or above.
  • Download and install the SDK.
  • Create a new Service Fabric solution and a Stateful Actor service. This will give you four projects: –
    • An SF hosting project. This has no code in it, but is essentially just the manifest for what services get deployed and how to host them.
    • An Actors project. This holds your actor classes and any associated code; it also serves as a bootstrapper that deploys the appropriate services into SF; as such, it's actually an executable program which does this during Main(). It also holds a couple of XML configuration files that describe the name of the package and each of the services that will be hosted.
    • An Interfaces project. This holds your actor interfaces. I suspect that this project could just as easily be collapsed into the actors one, although I suppose for binary compatibility you might want to keep the two separate so you can update the implementations without redeploying the interfaces to clients.
    • A console test project. This just illustrates how to connect to the Service Fabric and create actors. In the F# world these projects serve zero purpose since we can just create a script file to interact with our code, so I deleted this immediately.
  • Convert to Paket (optional). If you use Paket rather than Nuget for dependency management, change over now. The convert-from-nuget works first time; you’ll end up with a simplified packages file of just a single dependency (Microsoft.ServiceFabric.Actors), plus you’ll get all the other benefits over Nuget.
  • Create F# project equivalents. The two core projects, the Actors and Interfaces projects, can simply be recreated as an F# Console App and Class Library respectively. The only trick is to copy across the PackageRoot configuration folder from the C# Actors project to the equivalent F# one. Once you’ve done this, you can essentially disregard the C# projects.
  • Configure the F# projects. I set both projects to 4.5.1 (as this is what the C# ones default to) – I briefly tried (and failed) to get them up and running in 4.5.2 or 4.6. Also, make sure that both projects target x64 rather than AnyCPU. This is more than just changing the target in the project settings – you must create a Configuration (via Configuration Manager) called x64!
  • Create an interface. This is pretty simple – each actor is represented by an interface that inherits from IActor (a marker interface). Make sure that all arguments in all methods have explicit names! If you don't do this, your actors will crash on initialisation.
  • Create the implementation. An example of a Cat actor interface and implementation is sketched just after this list.

  • Update the Hosting project. Reference the implementation from the Hosting project and update the configuration appropriately.
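
Here's a hedged sketch of that interface and implementation – the names and numbers are illustrative rather than the sample project's exact code: –

```fsharp
open System.Runtime.Serialization
open System.Threading.Tasks
open Microsoft.ServiceFabric.Actors

type ICat =
    inherit IActor
    // note that every argument is explicitly named – see the warning above
    abstract Jump : destination:string -> Task
    abstract Feed : calories:int -> Task

[<DataContract>]
type CatState() =
    [<DataMember>] member val Hunger = 0 with get, set

type Cat() =
    inherit Actor<CatState>()
    interface ICat with
        member this.Jump destination =
            this.State.Hunger <- this.State.Hunger + 5
            Task.FromResult () :> Task
        member this.Feed calories =
            this.State.Hunger <- this.State.Hunger - calories
            Task.FromResult () :> Task
```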

Luckily, I’ve done all of this in a sample project available here.

Running your project

Once you’ve done all this, you can simply hit F5 (or Publish from the Host project) and watch as your code is launched into the Fabric via the UI.

You can then also call into your actors via e.g. an F# script: –
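(A hedged sketch – the dll references and application name are illustrative; ActorProxy and ActorId are part of the preview Actors SDK.)

```fsharp
#r "Microsoft.ServiceFabric.Actors.dll"
#r "CatActors.Interfaces.dll" // hypothetical name for the interfaces assembly

open Microsoft.ServiceFabric.Actors

let felix = ActorProxy.Create<ICat>(ActorId "Felix", "fabric:/CatApplication")
felix.Jump("the sofa").Wait()
felix.Feed(100).Wait()
```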

I’m looking forward to talking more about the coding side of this in my next post, where we can see how code that is inherently mutable doesn’t always fit idiomatically into F#, and how we can take advantage of F#’s ability to mix and match OO and FP styles to improve readability and understanding of our code without too much effort.

Distributing the F# Mailbox Processor


Note: This blog post is part of the 2014 F# Advent Calendar. Be sure to check out yesterday’s Intro to Data Science post by Jon Wood!

Mailbox Processors 101

If you've been using F# for any reasonable length of time, you'll have come across the MailboxProcessor, AKA the F# Agent (or Actor). Mailbox Processors are cool. They give us the ability to offload work to background processors without worrying about managing the thread that they live on (agents silently "go to sleep" when they aren't processing anything), and they take away the pain of locking, as they ensure that only one message will be processed at a time whilst automatically queuing up backed-up messages. They also allow us to visualise problems differently to how we might when just using a raw Task, in terms of message passing. We can partition data by pushing it to different actors, and thus in some cases eliminate locking issues and transactions altogether.

Here’s a sample Mailbox Processor that receives an arbitrary “message” and “priority” for a specific user and automatically accumulates them together into a file (let’s imagine it’s a distributed file e.g. Azure Blob Storage or similar):-
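(A hedged reconstruction of the sample – saveToStorage is a stand-in for writing to e.g. Azure Blob Storage.)

```fsharp
type Command =
    | SaveMessage of message:string * priority:int
    | ClearMessages

/// Stand-in for appending the accumulated messages to a distributed file.
let saveToStorage (userId : string) (messages : (string * int) list) =
    async { printfn "Saving %d messages for %s" messages.Length userId }

/// Generator function – creates a new agent for a given user on demand.
let createAgent userId =
    MailboxProcessor.Start(fun inbox ->
        let rec loop messages = async {
            let! command = inbox.Receive()
            match command with
            | SaveMessage (message, priority) ->
                let messages = (message, priority) :: messages
                do! saveToStorage userId messages
                return! loop messages
            | ClearMessages ->
                do! saveToStorage userId []
                return! loop [] }
        loop [])
```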

As is common with F#, we use a discriminated union to show the different types of messages that can be received and pattern match on them to determine the outcome. Also notice how the agent code itself is wrapped in a generator function so that we can easily create new agents as required.

All this is great – what's not to love? Well, it turns out that the Mailbox Processor does have a few limitations: –

  • In process only. You can’t distribute workloads across multiple machines or processes.
  • Does not handle routing. If you want multiple instances of the same agent, you need to manually create / destroy those agents on demand, and ensure that there is a way to route messages to the correct agent. This might be something as simple as a Map of string and agent, or something more complex.
  • Does not have any fault tolerance built-in. If your agent dies whilst processing a message, your message will be lost (unless you have written your own fault handling mechanism). If the agent crashes, all queued messages will also be lost.

These are things that are sometimes native in other languages like Erlang, or provided in some frameworks.

Actors on F#

At this point, it’s worth pointing out that there are already several actor frameworks that run on F# (some of more utility than others…): –

  • Akka .NET – a port of the Scala framework Akka. Akka is a tried-and-tested framework, and the port is apparently very close to the original version, so it might be a good call if you’re coming from that background.
  • Cricket – an extensible F# actor framework that has support for multiple routing and supervision mechanisms.
  • Orleans – an actor framework by Microsoft that was designed for C# but can be made to work with F#. However, it’s not particularly idiomatic in terms of F#, and seems to have been in alpha and beta forever.
  • MBrace – a general distributed compute / data framework that can be made to work as an actor framework.

One commonality between the above is that none of them use the native Mailbox Processor as a starting point. I wanted something that would enable me to simply lift my existing Mailbox Processor code into a distributed pool of workers. So, I wrote CloudAgent – a simple, easy-to-use library that is designed to do one thing and one thing only – distribute Mailbox Processors with the minimal amount of change to your existing codebase. It adds the above three features (distribution, routing and resilience) to Mailbox Processors by pushing the bulk of the work into another component – Azure Service Bus (ASB).

Azure Service Bus

ASB is a resilient, cheap and high-performance service bus offered on Azure as a platform-as-a-service (PAAS). As such, you do not worry about the physical provisioning of the service – you simply sign into the Azure portal, create a service bus, and then create things like FIFO queues or multicast Topics underneath it. The billing for this is cheap – something like $0.01 for 10,000 messages, and it takes literally seconds to create a service bus and queue.

How do we use ASB for distribution of Mailbox Processors? Well, CloudAgent uses it as a resilient backing store for the Mailbox Processor queue, so instead of seeing messages stack up in the mailbox processor itself, they stack up in Azure instead, and are pulled one at a time into the mailbox processor. CloudAgent automatically serializes and deserializes the messages, so as far as the Mailbox Processor is concerned, this happens transparently (currently this is JSON but I’m looking to plug in other frameworks such as FSPickler in the future). We’ll see now how we use the features of ASB to provide the previously mentioned three features that we want to add to Mailbox Processors.

Distribution and Routing

These first two characteristics can essentially be dealt with in one question: "How can I scale Mailbox Processors?". Firstly, as we're using Service Bus, it automatically handles both multiple publishers and subscribers for a given FIFO queue. This allows us to push many messages onto a queue, and have many worker processes handling messages simultaneously. This is something CloudAgent does automatically for us – when a consumer starts listening to a Service Bus Queue, it will immediately start polling for new messages (or, as we'll see shortly, sessions), and then route them to an "appropriate" worker. What does this mean? To answer that, we need to understand that there are two types of worker models: –

Worker Pools

Worker Pools in CloudAgent are what I would classify as "dumb" agents. They do not fit in with the "actor" paradigm, but are more for processing generic messages that do not necessarily need to be handled in a specific sequence, or by a single instance. This might be useful where you need "burst out" capability for purely functional computations that can be scaled horizontally without reliance on other external sources of data. In this model, we use a standard ASB queue to hold messages, and set up as many machines as we want to process messages. (By default, CloudAgent will create 512 workers per node.) Each CloudAgent node will simply poll for messages and allocate each one to a random agent in its local pool.

Actor Pools

Actor Pools fit more with the classic Agent / Actor paradigm. Here, messages are tagged with a specific ActorKey, which ensures that only a single instance of an F# Mailbox Processor will process messages for that actor at any one time. We use a feature of ASB Queues called "Sessions" to perform the routing: each CloudAgent node will request the next available "session" to process; the session represents the stream of messages for a particular actor. Once a session is made available (by a message being sent to the queue with a new actor key), it will be allocated to a particular worker node, and subsequently to a new instance of a Mailbox Processor for that actor (CloudAgent maintains a map of Actor Key / Mailbox Processors for local routing).

So if you send 10 messages to Actor “Joe Bloggs”, these will all be routed to the same physical machine, and the same instance of Mailbox Processor. Once the messages “dry up”, that specific mailbox processor will be disposed of by the CloudAgent node; when new messages appear, a new instance will be allocated within the pool and the whole cycle starts again.

Here’s an example of how we would connect our existing Mailbox Processor from above into CloudAgent in terms of both producer of messages (equivalent of Post) and consumer of messages:

Notice that there is no change to the Mailbox Processor code whatsoever. All we have done is bind up the creation of a Mailbox Processor to ASB through CloudAgent. Instead of us calling Post on an agent directly, we send a message to Service Bus, which in turn is captured by CloudAgent on a worker node, and then internally Posted to the appropriate Mailbox Processor. In this context, you can think of CloudAgent as a framework over Mailbox Processors to route and receive messages through Azure Service Bus Queues.

Resiliency

An orthogonal concern to routing and distribution is that of message resiliency. One of the features that we get for free by pushing messages through Azure Service Bus is that messages waiting to be processed are by definition resilient – they're stored on Service Bus for a user-defined period until they are picked up off the queue and processed. You might set this to a minute, a day, or a week – it doesn't matter. So until a message starts to be processed, we do not have to worry if no consumers are available. But what about messages that are being processed – what if the Mailbox Processor crashes part-way through? Again, CloudAgent offers us a way of solving this: –

Basic Agents

Basic Agents contain no fault tolerance within the context of message processing. This actually fits nicely with the standard "fire-and-forget" mechanism of Posting messages to F# MPs. Whilst it's your responsibility to ensure that you handle failures yourself, you don't have to worry about repeats of messages – they will only ever be sent to the pool once. You might use this for messages that might go wrong once in a while, where it's not the end of the world if they do. With a Basic Agent, the above Mailbox Processor code sample would not need to change at all.

Resilient Agents

Service Bus also optionally gives us "at least once" processing. This means that once we finish processing a message, we must respond to Service Bus and "confirm" that we successfully processed it. If we don't respond in time, Service Bus will assume that the processor has failed, and resend the message; if too many attempts fail, the message will automatically get dead-lettered. How do we map this "confirmation" process onto Mailbox Processors? That's easy – through a variant of the native PostAndReply mechanism offered by Mailbox Processors. Here, every message we receive contains a payload and a reply channel that we call with a choice of Completed, Failed or Abandoned. Failure tells Service Bus to retry (until it exceeds the retry limit), whilst Abandon will immediately dead-letter the message. This is useful for "bad data", rather than for transient failures such as database connection failures, where you would probably want the retry functionality.

Here’s how we change our Mailbox Processor code to take advantage of this resilient behaviour; notice that the Receive() call now returns a payload and a callback function that gets translated into ASB. We also add a new business rule that says we can never have more than 5 messages saved at once; if we do, we’ll reject the message by Abandoning it: –
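(A hedged sketch: Command and saveToStorage are as in the earlier sample, and ProcessResult is declared here purely for illustration – it mirrors the Completed / Failed / Abandoned choices that CloudAgent translates into Service Bus calls.)

```fsharp
type ProcessResult = Completed | Failed | Abandoned

let createResilientAgent userId =
    MailboxProcessor.Start(fun inbox ->
        let rec loop messages = async {
            // Receive now yields the payload plus a reply channel
            let! command, (reply : ProcessResult -> unit) = inbox.Receive()
            match command with
            | SaveMessage _ when List.length messages >= 5 ->
                reply Abandoned // business rule breached – dead-letter immediately
                return! loop messages
            | SaveMessage (message, priority) ->
                let updated = (message, priority) :: messages
                let! result = saveToStorage userId updated |> Async.Catch
                match result with
                | Choice1Of2 () ->
                    reply Completed
                    return! loop updated
                | Choice2Of2 _ ->
                    reply Failed // transient failure – let Service Bus retry
                    return! loop messages
            | ClearMessages ->
                do! saveToStorage userId []
                reply Completed
                return! loop [] }
        loop [])
```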

At the cost of having to explicitly reply to every message we process, we now get retry functionality with automatic dead-lettering. If the agent crashes and does not respond within a specific time, Service Bus will also automatically resend the message for a new agent to pick up. Bear in mind, though, that this also means that if the first consumer does not respond in time, Service Bus will assume it has died and repost the message – so a message may be posted many times. Therefore, in this model, you should design your agents to process messages in an idempotent manner.

Conclusion

Here’s a screenshot of three processes (all running on the same machine, but could be distributed) subscribing to the same service bus through CloudAgent and being sent messages for three different Actors. Notice how all the messages for a given actor have an affinity to a particular console window (and therefore consumer and agent): –

Service Bus enables us to quickly and easily convert Mailbox Processors from single-process concepts into massively scalable and fault-tolerant workers that can be used as dumb workers or as actors with built-in routing. The actual CloudAgent code is pretty small – just four or five .fs files and around 500 lines of code. This isn't only because F# is extremely succinct and powerful, but also because Azure is doing the heavy lifting of everything we want from a messaging subsystem; when coupled with the Mailbox Processor / Agent paradigm, I believe that this forms a simple yet compelling offering for distributing processing in a fairly friction-free manner.

Lightweight websites with F#


There are several common approaches I’ve seen people take on the .NET platform when writing web-based applications that I want to review in terms of language and framework choice: –

  • Adopt a conventional MVC application approach. Write static HTML that is emitted from the server using e.g. Razor markup + C# / VB .NET, write your controllers and any back-end logic in C#.
  • As above, but replace your back-end logic with F#. This is a reasonable first step to take, because all your data access and "back-end" processing is performed in a language that is best suited for it, whilst your C# is relegated to essentially thin controllers and some simple markup logic.
  • Adopt a "SPA"-style approach. By this I mean split your web application into two distinct applications – a client-side application that is self-managing, typically using Javascript and some framework like Knockout or AngularJS; meanwhile your back-end is a hosted WebAPI written in F#.
  • Write the entire application in F#. Surely you can’t write websites in F# can you? Well, actually, there are some (pretty sophisticated) frameworks like WebSharper out there that can do that, rewriting your F# into e.g. Typescript and the like.

I haven't used WebSharper in depth, so I can't comment on the effectiveness of writing your client-side code in F# and am therefore not going to talk about the latter option today. But I have written WebAPIs in F#, and want to talk about where I think your separation of concerns should lie with respect to client- and server-side code.

As far as I'm concerned, if you're a .NET developer today writing websites, then you should be writing as much of the CLR-side code as possible in F#. I am really pleased with the brevity that you get from the combination of OWIN, Katana (Microsoft's web-hosting OWIN framework), Web API and F#. This combination allows you to create Web APIs simply and easily, and when combined with a SPA client-side website it is a compelling architectural offering.

Sudoku in F#

Some months ago, I wrote a Sudoku solver in F# (I think that there’s a gist somewhere with the implementation). I wanted to try to write a website on top of it with a visual board that allowed you to quickly enter a puzzle and get the solution back. So, having borrowed some HTML and CSS from an existing website, I set about doing it. You can see the finished site here and the source code is here.


Client

  • HTML
  • AngularJS
  • Typescript (no native Javascript please!)

Server

  • F#
  • F#
  • F#

Standard JSON is used to pass data between website and server. On the server side, we use OWIN, Katana and Web API to handle the web “stuff”. This then ties into the real processing with the minimum of effort. This was all done in a single solution and a single F# project.

OWIN with F#

I'm no Angular or Typescript expert, so I'm not going to focus on them – suffice it to say that Typescript is a massive leap over standard Javascript whilst retaining backwards compatibility, and AngularJS is a decent MVC framework that runs in Javascript. What I'm more interested in talking about is how to host and run the entire site through a single F# project. Mark Seeman's excellent blog has already discussed creating ASP .NET websites through F#, and there are indeed some templates that you can download for Visual Studio that enable this. However, they still use ASP .NET and the full code-bloat that it comes with. Conversely, using OWIN and Katana, this all goes away. What I like about OWIN is that there's no code generation, no uber folder hierarchies or anything like that; you have full control over the request / response pipeline, plus you get the flexibility to change hosting mechanisms extremely easily. To start up, all we need to do is download a (fair) few NuGet packages, and then create a Startup class with a Configuration method: –
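(A hedged sketch, assuming the Katana / Web API OWIN packages – e.g. Microsoft.AspNet.WebApi.Owin – are installed.)

```fsharp
open Owin
open System.Web.Http

type Startup() =
    member __.Configuration (app : IAppBuilder) =
        let config = new HttpConfiguration()
        // a single route is enough for the Sudoku API
        config.Routes.MapHttpRoute("Api", "api/{controller}") |> ignore
        app.UseWebApi config |> ignore
```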

Now that you have that, you can simply create Web API controllers: –
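(Another hedged sketch – solvePuzzle stands in for the real solver, and the message contract is simplified to an int array.)

```fsharp
open System.Web.Http

/// Stand-in for the real Sudoku solving function.
let solvePuzzle (puzzle : int []) : int [] = puzzle

type SudokuController() =
    inherit ApiController()
    member __.Post ([<FromBody>] puzzle : int []) = solvePuzzle puzzle
```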

So two F# files, a web.config and you’re good to go from a server-side point of view. Talking of web.config – how do you create an F# web project? Mark Seeman’s blog gives full details on creating Visual Studio web projects that are F# compliant, but essentially just adding the “Project Type” GUID in the .fsproj file (I think it’s 349C5851-65DF-11DA-9384-00065B846F21) will do the job.

Combining client and server side assets

Because this is a full .NET web project, you can do all the things that you would normally do in C# web projects, such as serving up static files (perfect for a SPA) like HTML, Javascript and CSS, as well as compiling Typescript files (just add a project import for the Typescript msbuild target). If you appreciate the extra security you get from F# over other statically typed .NET languages, you'll almost certainly want to use Typescript over raw Javascript as well, so this should be a given.

A single project that can serve up your web assets and the server side logic to go with it looks pretty simple – in this screenshot, in the api folder is my back-end logic – message contracts between client and server, the actual puzzle solver and the Web API controller.

Client side assets are few and far between – just a SudokuController.ts to hold the controller logic and Index.HTML + stylesheet for the presentation layer. It’s important to note that with a SPA framework like AngularJS, you serve static HTML and Javascript; the Javascript then essentially bootstraps, modifying the HTML dynamically, requesting JSON from the WebAPI and occasionally getting more static HTML. You never modify HTML on the server as you would do with something like Razor.

In addition, as it’s a “normal” website, with Visual F# 3.1.2, you can use Azure websites to deploy this easily – either through VS to manually publish out to Azure, or through Azure’s excellent source control integration to e.g. GitHub or BitBucket webhooks. It’s never been easier to get a CI deploy of a website out.

More flexibility with Web API

Another important thing about Owin is that it separates out the hosting element of the website from the actual project structure. So, after talking about all this nice website project integration, there’s actually nothing to stop you creating a standard F# library project, and then use either the Owin WebHost console application (available over NuGet), or create an empty website or Azure worker and then host it through that via the Owin Host. All this can be done without making any changes to your actual configuration class or actual controllers.

Conclusion

A common misconception around F# is that it’s great for use as a “computation engine” where you give it a number and it gives you back another number. Or perhaps a “data processing engine” where it can read some data from a flat file or a web service and do something to it. These are both true – however, there is very little reason why you can’t use it for full featured Web APIs using Owin (as there’s no code generation to miss out on from e.g. VS/C# projects), and with a minimum amount of effort, even as a full website host for a SPA that will consume that same Web API.

In my next post I want to replace the use of Typescript with F# using Funscript to illustrate how you can have a fully-managed end to end solution for both client and server in F#.