Running F# on Microsoft Azure


As we keep seeing the same questions on Twitter (or elsewhere) regarding how to run F# applications on Azure, someone suggested to me that I write a post that outlines what your options are, and when and where you should use one service or the other.

.NET on Azure

Azure has always had a good “developer-focused” story for deploying .NET code. Unlike, say, Amazon Web Services (which started with a VM-first story and has only more recently started to offer higher-level abstractions), Azure has always been about offering the .NET developer ways to run code that are abstracted away from the VM. In the early days of Azure there was really only one option for hosting code (and zero options for running VMs), but as the number of Azure services grows at an incredible rate, there are now several different options available out-of-the-box. So, here’s a brief outline of some of the compute options that you currently have in Azure.

Disclaimer: If you’re an Azure boffin and notice some simplifications in this posting – it’s by design :-) But please let me know if you notice something that’s completely wrong!

Cloud Services

Cloud Services were, as far as I’m aware, the first platform service offering on Azure. You write your .NET code, you wrap it in a “cloud service”, and push it into Azure. Azure will handle provisioning of a VM for you, deploy your code onto it, and provide you with a load-balancer from which you can expose specific ports to your application as required. Alternatively, you can use one of the Azure messaging mechanisms, such as Queues or Service Bus, for sending commands to your worker role.

The process of wrapping your application in a Cloud Service essentially involves bundling it up into a zip file which contains the application files plus metadata about the service, e.g. the VM size that you wish to run on, how many instances you want to run, the name of the service and so on.

Once your code is running, you can scale out the service to a number of “instances” of that service (aka “role”) by dragging a slider within the Azure portal, or use an auto-scaler which scales based on a number of metrics e.g. CPU utilisation.

Of course, Visual Studio has a number of templates included out of the box for creating such projects, including (which might surprise some people) an F# one. The only issue is that FSharp.Core isn’t included on Azure VMs by default, so you’ll need to ensure that it gets included in your build output (either by setting CopyLocal to True, or by using the FSharp.Core NuGet package).

The good thing about Cloud Services is that they’re very flexible – you can run any arbitrary .NET (or even non-.NET) code on them, as they run in a similar manner to Windows Services – you get a number of events, such as when the service is spawned, when it shuts down, or when there’s a change to the configuration. You also get an API for configuration settings and even a distributed cache service across all instances of a role (although my understanding is that this cache service is being deprecated in the next 12 months).
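As a hedged illustration of that lifecycle, here’s a minimal F# worker role sketch against the standard Microsoft.WindowsAzure.ServiceRuntime API – the body of the run loop is illustrative: –

```fsharp
open System.Threading
open Microsoft.WindowsAzure.ServiceRuntime

type WorkerRole() =
    inherit RoleEntryPoint()

    // Fired when the instance is spawned - a good place to wire up configuration.
    override __.OnStart() = base.OnStart()

    // The main loop of the role - poll a queue, process messages etc.
    override __.Run() =
        while true do
            // do some work here...
            Thread.Sleep 10000

    // Fired when the instance is being shut down.
    override __.OnStop() = base.OnStop()
```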

However, Cloud Services are starting to show their age – there’s no support for direct integration with GitHub or other source control systems, for example. Instead, you need to package up your code into the “.cspkg” format and deploy that – which can take quite a while (minutes rather than seconds). Furthermore, there’s not much flexibility in terms of deployment – you write your code, you deploy it, and it maps to a VM. There’s no scope for multi-tenanting several services on a single VM unless you do this manually at the code level within the Cloud Service, and configuration is a little painful compared to some of the other options out there.

F# and Cloud Services

Use Cloud Services for general-purpose F# applications – think of them as distributed, replicated Windows Services. As an example, the MBrace Azure project runs on Cloud Services. Just note that because they’re so flexible, you may have to do some plumbing to fit a Cloud Service to your specific needs (this isn’t F#-specific), such as opening ports or the like, and there are no Cloud Service-specific APIs for connecting to other Azure services.

Web Apps

Web Apps (formerly Azure Websites) were originally designed to simplify hosting ASP .NET applications in Azure, as opposed to using Cloud Services, responding to HTTP traffic over port 80. As such, there’s out-of-the-box support for hosting any ASP .NET application in IIS directly from a source code repository e.g. GitHub; in addition, you can use the WebDeploy mechanism and tooling built into Visual Studio, plus there’s FTP support. Web Apps also come with Kudu, a deployment engine with a scripting toolset for things like deployment-time file diffs. Web Apps also support several “deployment slots” that can host multiple versions of your website, so you can deploy to a number of slots and hot-swap the “live” one in and out with essentially instant results.

But Web Apps sit at a higher level of abstraction than, say, Cloud Services – think of them as “IIS as a service” rather than just “code as a service”. This is what they’re designed for, and everything about them, from all the examples you see on the Azure website to the out-of-the-box reports and metrics in the portal, shows that they are geared up for HTTP-enabled applications. There are actually some good examples of using F# within ASP .NET – either directly using some of the third-party Visual Studio project templates, or using an IIS module that allows you to redirect all HTTP requests to any arbitrary executable – you can use such a mechanism to e.g. run Suave on Azure websites.

Of course, there’s nothing to stop you writing an application that ignores HTTP traffic and communicates through e.g. Azure Queues, but you’d probably be better off using Cloud Services for something like that. As a “get out” from the HTTP route, you can also make use of “Web Jobs”, which are arbitrary executables that run within the context of a Web App, and can be set up to run on a schedule, continuously, or whenever e.g. a message gets dropped onto a storage queue. So imagine an e-commerce website where search results are displayed from a batch-generated set of read-only database results: you could use a web job that runs every n minutes, or whenever some data changes, to regenerate that table. This all runs within the context of the “website”, on the same VM – so if you want to scale the web job, you also have to scale the web app at the same time.
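To make that concrete: a continuously-running web job is, at heart, just a console application. Here’s a hedged sketch that polls a storage queue using the Azure Storage SDK – the connection string and queue name are hypothetical: –

```fsharp
open System.Threading
open Microsoft.WindowsAzure.Storage

[<EntryPoint>]
let main _ =
    // Illustrative connection string and queue name.
    let account = CloudStorageAccount.Parse "UseDevelopmentStorage=true"
    let queue = account.CreateCloudQueueClient().GetQueueReference "search-results"
    queue.CreateIfNotExists() |> ignore
    while true do
        match queue.GetMessage() with
        | null -> Thread.Sleep 1000            // nothing to process - back off briefly
        | message ->
            // regenerate the read-only search results table here...
            printfn "Regenerating results for %s" message.AsString
            queue.DeleteMessage message
    0
```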

Web Apps are a great option for running applications that are geared towards serving HTTP requests, particularly if you’re running an ASP .NET application on IIS. There are also some options for running other types of application, e.g. batch processes, within the context of the website. You can also run them for less money than Cloud Services, as the cheaper tiers allow for hosting multiple web apps within a single physical VM. So whilst Web Apps lack the flexibility of Cloud Services – you don’t get the same choice of VM sizes, and you run within the confines of IIS – they’re a much more modern option, with better support throughout Azure, and you should consider them where possible. There’s also much better support for Web Apps on the newer portal, and features like authentication with a number of providers (e.g. Twitter, AD, Google) are included as part of the service for free.

F# and Web Apps

Use Web Apps for any application that is primarily HTTP-facing. You have several options for running F# within a Web App: –

  • Create a C# web project as a thin e.g. MVC veneer and delegate calls to your F# code.
  • Create an F# web project (either hand-rolled or using one of the third-party templates available) and use something like Web API through the full ASP .NET stack or via OWIN / Katana. I believe that there’s also a full MVC F# template out there as well.
  • Deploy an alternative application that can respond to HTTP traffic e.g. Suave, and use a custom module in IIS to route traffic through to it – see the sketch below. There’s a working solution for this within my GitHub repo.
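To give a flavour of that last option, here’s a hedged, minimal Suave application. The real routing piece is IIS configuration (not shown), and in a real deployment you’d bind Suave to the port that IIS hands over rather than the default: –

```fsharp
open Suave
open Suave.Successful

[<EntryPoint>]
let main _ =
    // In a real deployment you'd build a SuaveConfig that binds to the port
    // supplied by IIS (e.g. via an environment variable) instead of the default.
    startWebServer defaultConfig (OK "Hello from Suave on Azure!")
    0
```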

Service Fabric

As Cloud Services have aged, and containerised solutions such as Docker have become more and more popular, Microsoft have looked to create a more modern Azure service for hosting your code. Service Fabric allows you to host any executable (.NET or not) – so in this respect it’s somewhat like a Cloud Service. However, unlike a Cloud Service, a Service Fabric cluster is made up of a number of nodes, on which you can host a number of services. In Microsoft’s mind, this allows you to create a whole host of micro-services hosted on your SF cluster. Each service within an SF cluster will be replicated across the nodes of the cluster – you can shape this behaviour with a reasonable level of control, e.g. singleton services, n instances, or one instance per node (i.e. the cluster size). If a node goes down, another one will spin up automatically and rehydrate its state, and you can also dynamically resize a cluster to introduce new nodes as required.

In addition to the usual Azure services that you may want to connect to, you can also use a number of SF-specific services that are hosted and run within the context of your cluster. These include services for sharing data in a consistent manner, and for messaging across services in the cluster. Among them are .NET collections that work in a consistent, transactional manner across the cluster (sort of like the Concurrent collections in .NET, but distributed across nodes rather than threads). So if you imagine a micro-service which needs to maintain some state, rather than resorting to a database you could simply use an out-of-the-box distributed collection.
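As a hedged sketch of what that can look like in code – using the reliable collections API from the Microsoft.ServiceFabric.Data packages, with illustrative names – a replicated, transactional dictionary is used much like a local one: –

```fsharp
open Microsoft.ServiceFabric.Data
open Microsoft.ServiceFabric.Data.Collections

// Adds or updates a counter in a dictionary that is replicated across the cluster.
let recordVisit (stateManager : IReliableStateManager) visitor = async {
    let! visits =
        stateManager.GetOrAddAsync<IReliableDictionary<string, int>>("visits")
        |> Async.AwaitTask
    use tx = stateManager.CreateTransaction()
    let! _ = visits.AddOrUpdateAsync(tx, visitor, 1, fun _ count -> count + 1) |> Async.AwaitTask
    do! tx.CommitAsync() |> Async.AwaitTask }
```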

Whilst Service Fabric is a relatively new public offering on Azure, it has actually been used internally by Microsoft to run a number of their own services for a while now – so it’s a tried and tested technology.

It’s worth pointing out that in addition to the “host any raw executable” model, Service Fabric comes with an Actor model loosely based on the Orleans model. It’s not exactly the same, but it’s a fairly close fit. But don’t think that Service Fabric is just an actor service – it’s a much more flexible container model that happens to come with an actor framework.

F# and Service Fabric

It’s important to know that, out of the box, there is zero tooling support for Service Fabric with F#. There are no project templates, and no automated tooling for updating the myriad XML configuration files or the PowerShell deployment scripts that Service Fabric requires. I’ve documented on this blog the steps that are (or were – that post is a little old now) required to get SF up and running with F# in terms of the project structure and tooling, and it is possible. I do wonder if there’s a more “lightweight” way to go about it though, i.e. just deploy an executable with a lightweight project shell that hosts that exe, decoupled from the SF project fluff. Currently the story that’s told by Visual Studio and Microsoft in terms of the required configuration seems a bit of a backwards step to me.

Having said all that – once it’s up and running, Service Fabric offers a great story in terms of features for hosting services. It’s more powerful than Cloud Services, with out-of-the-box micro-service messaging, both stateless and stateful services that can replicate and scale automatically, and multi-tenanting of several services on a single node.

If you’re thinking of hosting non-web-facing code – whether micro-services working together or arbitrary disconnected services – Service Fabric is a good choice. But it’s a more heavyweight option than e.g. Cloud Services, in the sense that it comprises a cluster of nodes (i.e. VMs) designed for hosting multiple services. It might be overkill for a single service.

Conclusion

Azure offers a number of great developer-facing options for hosting F# applications, whether web-facing APIs or back-end services that receive requests via e.g. queues. There are also other compute options that I’ve not touched on here (such as Azure Batch), and more features coming all the time – so it’s definitely worth trying out to see how it can open up some new possibilities for your specific needs.

Visual Studio Team Services and FAKE


What is VSTS?

Visual Studio Team Services (VSTS) is Microsoft’s cloud-based source control / CI build / work item tracking system (with a nice visual task board). It’s a platform that is evolving relatively quickly, with lots of new features being added all the time. It also comes with a number of plans, including a free one which entitles you to an unlimited number of private repositories with up to 5 users (MSDN users don’t count towards that limit), plus a fixed number of hours for centralised builds.

The catch is that there isn’t (at least, I couldn’t find!) any way to host a completely public repository with unlimited users – obviously a big problem if you want to run an open source project with lots of contributors. But for a private team, it might be a good model for you. You can also use the CI build facilities of VSTS with GitHub repositories – so in this sense you can treat it as a competitor to something like AppVeyor.

Contrary to common opinion, VSTS is completely compatible with Git as a source control repository. Yes, you can opt to use TFS as a source control model, but (in my opinion) you’d have to be crazy to do this unless your team is really used to the TFS way of working – I find Git to be a much, much more effective source control system.

Why FAKE?

I wanted to try and see whether it was possible to get VSTS working with FAKE.

One of the best things about FAKE – in addition to the ease of use, flexibility and power you get by creating build tasks directly within F# (and therefore with the full weight of .NET behind it) – is that because you’re not dependent on hosting a bespoke build server with custom tasks, such as TeamCity, it’s extremely rare (hopefully never) that a build runs locally but fails to run on the build server.

Unlike relying on e.g. TeamCity to orchestrate your build, you delegate the entire set of CI build steps to FAKE – MSBuild, unit tests, configuration file rewriting and so on. So if the build fails, you don’t have to log into the TeamCity box and trawl through log files – your FAKE script does all the heavy lifting, so you can run the exact same steps locally.

Putting it all together

So my goal was to write a simple FAKE build script which pulled down any dependencies, performed a build and ran unit tests – all integrated within VSTS. As it turns out, it wasn’t very difficult at all.

Firstly, we hook up the build to source control. In our case, it’s the Git repository of the Team Project, so it works straight out of the box, but you can point to another Git repository, e.g. on GitHub, as well. You can also select multiple branches. We then set a trigger to occur on each commit.

(Screenshot: the build hooked up to the Git repository with a trigger on each commit.)

Secondly, we have to set up the actual build steps. As we’re delegating to FAKE to perform the whole build + tests, we want to use as few “custom” VSTS tasks as possible. In fact, we actually only need two steps.

  1. Some way to download Paket or NuGet, and then initiate the FAKE build.
  2. Some way of tying the results of the xUnit tests that we’re going to run in FAKE into the VSTS test reports.

Unlike old-school TFS etc., VSTS now has an extensible and rich set of build tasks that you can chain together – no need for Workflow Foundation etc. at all here: –

(Screenshot: the catalogue of VSTS build tasks, including the Batch Script task.)

Notice the “Batch Script” task above – perfect for our needs, as we can use it to perform our first build step: downloading Paket and then starting FAKE.

We can now see what the FAKE script does – this is probably nothing more than what you would do normally with FAKE anyway to clean the file system, perform a build and then run unit tests: –
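Here’s a hedged sketch of such a script, using the FAKE 4.x “classic” API – the paths, solution glob and test assembly pattern are illustrative: –

```fsharp
#r "packages/FAKE/tools/FakeLib.dll"
open Fake
open Fake.Testing

Target "Clean" (fun _ -> CleanDirs [ "bin" ])

Target "Build" (fun _ ->
    !! "**/*.sln"
    |> MSBuildRelease "bin" "Build"
    |> Log "Build output: ")

Target "Test" (fun _ ->
    !! "bin/*Tests*.dll"
    |> xUnit2 (fun p -> { p with XmlOutputPath = Some "TestResults.xml" }))

// Clean, then build, then run the tests.
"Clean" ==> "Build" ==> "Test"
RunTargetOrDefault "Test"
```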

Notice that when we run the unit tests, we also emit the results as an XML file. This is where the second VSTS build task (Publish Test Results) comes in – it parses the XML results and ties them into VSTS’ build report.

(Screenshot: the Publish Test Results build step.)

So when we next perform a commit, we’ll see a build report that looks something like this: –

(Screenshot: the VSTS build report.)

Notice that the chart on the right shows that I’ve run 2 unit tests that were successful – this is the second build task parsing the XUnit output. Of course we can also drill into the different stages if needed to see the output: –

(Screenshot: drilling into the output of an individual build stage.)

Conclusion

This post isn’t so much about either VSTS or FAKE features per se as it is about illustrating how both are flexible enough to plug the two together. What’s great about this approach is that we’re not locked into VSTS as a build system – we’re just using FAKE and running it centrally – but if we’re using VSTS, we can also benefit from the integration it offers with e.g. Visual Studio: creating work items, associating commits with work items, viewing them from VS and so on – whilst still using FAKE for our build.

F#, .NET and the Open Source situation


If you read my (generally sporadic) blog postings, you’ll know that in general I write about either F# and / or Azure, usually from a programmatic point-of-view. How easy is it to reason about a certain thing? How quickly can we make use of some Azure service from F#? And so on. In this post, as part of the annual F# Advent calendar, I want to switch focus and instead discuss an increasingly important area of software development that I’ve been exposed to as part of my learning and adoption of F# – that of open source software development.

Disclaimer: There have been a number of debates on Twitter and GitHub regarding open source within the context of several Microsoft-led projects over the past few weeks. This post is not a response to those issues; this post was in fact authored in early November, well before any of these debates had been initiated.

The distinction between .NET communities

Coming from someone who has been a .NET developer almost exclusively throughout my career (aside from a brief stint with C++ and the obligatory various web and database languages that you get dragged into), one of the elements of working in F# that you can’t really ignore is the community involvement that permeates through just about everything that happens in and around F#, from the development of the language to key packages that are developed, through community meetups and user-groups, through to the tooling that is used on both “classic” .NET and Mono / CoreCLR stacks. Before I started working with F#, I had very little experience of open source development or tools etc. Thankfully, I’ve found the community to be extremely open, friendly, and always happy to help out others, as well as to continually improve the state of software development in F#. So if there’s a lack of tooling available for a specific purpose, or a tool that doesn’t quite do what is needed, the mentality in the F# community is to improve the existing tooling if possible, and where not, simply write a new tool from scratch.

This is a very different mentality from that of the rest of the .NET community, where we are by and large reliant on what Microsoft serves up in terms of tooling, libraries and frameworks. This is not necessarily a failing of either the community or Microsoft – it’s just the way it is, and it’s something that has been fostered over the years by both parties. This model is slowly changing, and there are some popular OSS projects out there – JSON .NET being an example of something that is owned by the open source community and that Microsoft themselves now use. Other examples of open source software within the .NET community include EventStore and Nancy. However, the average .NET developer is generally content to accept the pace of change, and direction, that Microsoft sets. Historically this has not necessarily been a bad thing, but with the increasing rate of change of programming languages, frameworks and tools, it’s hard to see it remaining a successful model.

Misunderstanding the F# community

I think that these two different mindsets have (somewhat unfairly) earned the F# community a reputation of being somewhat disruptive and of not “going with the flow”. It’s true that some F# projects (notably type providers) are not compatible with C# and VB .NET. Yet it’s often forgotten (or not even known) that many open source projects that happen to have been written in F# work perfectly well in C# and VB (and often are explicitly designed with that in mind). Some examples include: –

  • Paket – a dependency manager designed to work well with NuGet packages and GitHub repositories
  • FAKE – a build automation system using a DSL built in F#
  • FsCheck – a property-based testing framework based on Haskell’s QuickCheck
  • Project Scaffold – a build template with everything needed for successful organisation of code, tools and publishing.

It’s sometimes disappointing to hear well-known people throughout the .NET and Microsoft community misconstrue projects like these as somehow only being of use to F# users.

At the same time, there is an element of smugness and / or superiority that sometimes permeates from some corners of the F# community, which I think isn’t all that helpful. To an extent though, rather than being an essentially negative act, this is caused by the enthusiasm of the community and its desire to improve the state of software development within .NET, and perhaps also by the frustration caused by some of the misunderstandings and FUD about F# in the public domain – e.g. that it’s only for maths people, it’s too hard to learn, or it’s not of use in line-of-business applications.

Defining open source software

I’d like to discuss a little about open source software now. Frankly, I would not consider myself an expert in defining what open source really is, but having spent a few years in the F# community contributing to a few different projects, I certainly have a better idea than I used to, and feel qualified to at least put down some opinions. I’m taking a leap of faith here that my experience of F# open source projects reflects positively on open source as a whole, because despite any criticisms that can be levelled at the F# community and ecosystem, it really has a fantastic approach to open source, collaborative software development that can be learned from. Here are some points of interest – and misconceptions – that I’ve observed from looking at a number of projects across the .NET community, both inside and outside the F# community.

GitHub

GitHub is a great site, and has become the de facto repository host for most F# open source projects, as well as for Microsoft’s attempts to move into the open source world. However, creating a code repository on GitHub (or moving an existing one from CodePlex to GitHub) does not instantly make a project “open source”. I’ll say that again, because it’s worth making a point of this: just the act of putting some code on GitHub does not mean it is an open source project. If it did, making a project open source would be trivial. I’ve learned that working on truly open source software is much more than simply making the repository public for everyone to look at.

Cross Platform .NET

I also see “open source” used as somehow being a catch-all term for software developed for Mac and Linux (particularly from some Microsoft quarters). This is a mistake. It suggests that software that is developed on (or for) Windows doesn’t need to be open source, or is somehow different in terms of how it is developed. It’s great that Microsoft are moving to a more open model, and adopting CoreCLR for running .NET code cross-platform – but, again, just making software run on Mac and / or Linux does not immediately elevate a project to being successfully “open source”.

Sharing is Caring

But the biggest challenge I see in creating a successful open source project is adopting a truly collaborative, pro-active approach to software development. This means being genuinely interested in external feedback on your project – even if it’s not what you want to hear. It means encouraging people to submit pull requests. It might also mean giving up complete ownership of the project, where you might not agree with every feature proposed by the community, or feel comfortable with strangers submitting PRs written in a different coding style to the one you are used to. It also means getting feedback in at as early a stage as possible – ideally at the planning stage of any feature – and accepting help from the community in shaping the design and implementation of said feature. This is not the same as working on a project for several months, releasing it to the public on GitHub, and then getting feedback on it retrospectively.

It’s only through adopting a really open, collaborative approach to software that you can have a cycle of < 48 hours from someone submitting a feature request, to several people giving feedback on the feature, to someone implementing it.

Microsoft, .NET and Open Source

Obviously, Microsoft have taken a big step in the last year or so to try to adopt “open source” development processes. This is to be applauded. But they’re not there yet, and there is some evidence which leads me to believe that it will take them a while to truly get open source – but they are trying. Some of the teams seem to have gotten the idea of getting feedback early, e.g. the C# compiler team. Ironically, a mature programming language like C# is one type of project that probably needs more control and direction than e.g. a library, where the barrier to entry is much lower. Conversely, there are also a number of teams at Microsoft developing frameworks and libraries that are ostensibly open source, yet I’m really not seeing much in the way of a collaborative mentality beyond the superficial steps of putting a repository onto GitHub and asking for feedback after a period of time spent developing in isolation. Releases and features are still managed with a reactive rather than proactive mentality. PRs from non-Microsoft team members are few and far between. Direction in many of these projects is controlled almost entirely by Microsoft. In short, the general open source community isn’t yet empowered to contribute to these projects in the way that I would like to see.

Conclusion

An important element of the distinction between the F# community and the rest of .NET is likely due to the differences in Microsoft’s investment in the F# and C# / VB .NET stacks – the fact that F# has been forced to stand on its own feet has helped it mature extremely quickly. In addition, it should be mentioned that there have been a number of key individuals within the F# community who have helped both grow and nurture the community – I’m not sure how F# would be looking today without them. Ultimately, what we do have today with F# is a growing global community, and an ecosystem of tools and libraries which takes the best of existing .NET packages combined with others that harness the power of F# as well. It’s a community which includes the Microsoft Visual F# team as a valued member, rather than an all-controlling entity, and a community which – whilst still growing and evolving – is open, confident and in control of its own destiny.

Thanks to the community for showing me all of this over the past few years, and for allowing me to contribute towards a fun, vibrant and growing ecosystem. If you want to learn about open source projects – and about F# in general – you could do a lot worse than look at some of the up-for-grabs issues of the projects on http://fsprojects.github.io/.

What on earth has happened to NuGet?


After several months away from NuGet, I had to use it again recently in VS2015. I’m completely and utterly gobsmacked at how poor the current experience is. It’s confusing, inconsistent and hard to use. Worse than that, it enables workflows that should never, ever be permitted within a package management system.

First experiences with the NuGet dialog

When I first saw the previews of VS2015, I thought that the non-modal dialog integrated into VS would be a good thing. Unfortunately, it suffers from not making it clear enough what the workflow is. You’re bombarded with dropdowns that sometimes have only one option in them, search boxes in non-intuitive positions, installation options that you don’t want to see, and so on.

Let’s start by adding a package to a solution. This is a common task, so it should be as easy as possible. Here I’m adding my old friend Unity Automapper to two projects in a solution (note that I’ve deliberately chosen an earlier version of the Automapper): –

(Screenshot: the Add Package dialog.)

Why so many options? Why a dropdown for Action that only has one item in it? What does the “Show All” checkbox do – it looks like nothing at this point. Are there projects hidden from the list? Why?

Working with packages

Next, let’s try to do something to the package, e.g. uninstall or update it. You go to Manage Packages for Solution. You don’t see the packages that you’ve already installed – instead, you see all available packages, ordered by (I assume) most popular download. My installed package is nowhere to be seen. How do I find installed packages only? Oh. You have to click “Filter” and then select “Installed”. That’s not what I thought of doing when I selected the menu option – I thought it would just show them to me. It took me a good few seconds to realise what had happened!

Also notice that the default dependency behaviour is “lowest” – I don’t like this. Highest should be the default, or maybe Highest Minor. Nine times out of ten you’ll just have to upgrade them all immediately afterwards anyway. Which brings me nicely onto the subject of upgrading NuGet packages…

Upgrading dependencies

So now that I’ve found my package, I want to upgrade it. I choose to manage packages for the solution, and change the action from Uninstall to Update. It picks the latest version available (good).

(Screenshot: the Update action with per-project checkboxes.)

But then notice in the screenshot above that it allows me to uncheck some of the projects within the solution. What?? Why would you want to explicitly upgrade only some of the projects within the solution! I can maybe think of some corner cases, but generally this is definitely a no-no – a recipe for absolute disaster. At best you’ll muddle on with some binding redirects; at worst you’ll get a runtime error at some indeterminate time in the future when you try to call a method that doesn’t exist. What happens if you have a breaking change between the two versions? Which one will get deployed? Can you be sure? Don’t do this. Ever. I can’t even understand why NuGet offers the user the ability to do this.

Even worse, if you choose to update a NuGet dependency at the project level (by e.g. right-clicking the references node in VS), VS will upgrade it on that project alone. The other projects that have that dependency will be left untouched. You’ll end up with different versions of a dependency in a solution without even realising it.

But let’s see what happens if we decide to live dangerously and go ahead anyway. Firstly, NuGet will happily upgrade half our solution and leave the other half in an old state – no warnings of impending doom, nothing. Your code might work, it might not. Maybe it’ll work fine for some weeks, and then you’ll discover that the upgrade contained a breaking change – but only when project A tries to call a method that no longer exists.

The next time you enter the NuGet panel to work with your dependencies – e.g. to uninstall the dependency – you’ll see the following: –

(Screenshot: the Uninstall action filtered by a specific package version.)

Why is only one of my two projects showing in the list of projects that I want to uninstall? Because it’s filtering on that specific version of the dependency (1.1.0). I have to choose the version that I want to uninstall. I don’t want that – I just want to uninstall the entire dependency. But no, we have to do it version by version. If we now change the Action to “Update”, the Version dropdown suddenly has a different meaning. It no longer acts as a filter – it now states which version we’ll upgrade to. Instead, you filter based on the projects checked in the list below. So the meaning of each control changes based on the action you’re performing. This is not good UI design.

UI Thoughts

In fact, the UI as a whole is really, really confusing. There are other issues too – it’s slow, and you can’t yet upgrade all packages in one go through the UI, so you either have to fall back to the Package Manager Console, or upgrade each package one by one. Of course, after any single update, the UI reverts back to showing “All” packages, which involves loading the top results from NuGet. This takes a few seconds. Then you need to change the dropdown to show “Installed” packages, and start all over again.

Conclusion

I’m really worried having seen the new NuGet. It’s been in VS2015 for a while now – are developers really putting up with this? It’s slow. It’s not a well-designed UI – even I can tell that. You don’t feel like you know what is actually going to get deployed. The workflows don’t really work. It’s not good.

And it’s not just the UI that concerns me with NuGet – it’s also the fundamental structure underneath it, which is unchanged and needs to be fixed. NuGet should manage dependencies across an entire solution by default, ensuring that dependencies across projects are stable. Instead, we have individual packages.config files on each project, each with its own version of each dependency. This is not what you want! As illustrated above, it’s very possible – and in fact quite likely on a larger project – that you’ll quickly end up with different versions of a dependency across projects. I’ve seen it lots of times. You run the risk of runtime bugs or crashes – worse still, ones that might only show up when you go down a specific code path in your app, probably the day after it goes live.

I was hoping that NuGet in VS2015 would start to address these issues, but unfortunately it seems like a large step backwards at the moment.

MBrace, CloudFlows and FSharp.Data – data analysis made easy


In case you’ve not seen it before, MBrace is a simple programming model for scalable cloud data scripting and programming with .NET. It’s written in F#, but has growing support for C# and VB .NET. Over the past year or so, I worked closely with the MBrace team to help get it working smoothly on Microsoft Azure, using features such as Service Bus and Storage to provide an excellent development and deployment experience. As MBrace gears up for a v1 release, the design of the API is looking extremely positive.

I’m going to demonstrate here a simple example that illustrates how easy it is to start working with a large CSV file available on the internet in an MBrace cluster, parsing and querying data easily – we’re going to analyse UK house prices over the past year (this file is freely available on the gov.uk website).

I’m going to assume that you have an MBrace cluster up and running – if you don’t, you can either use a local development cluster or download the latest source code and deploy a full cluster onto Azure using the example MBrace Worker Role supplied in the MBrace Azure source code.

Type Providers on MBrace

We’ll start by generating a schema for our data using FSharp.Data and its CSV Type Provider. Usually the type provider can infer all data types and columns, but in this case the file does not include headers, so we’ll supply them ourselves. I’m also using a local version of the CSV file which contains a subset of the data (the live dataset for even a single month is > 10MB): –
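Here’s a hedged sketch of that declaration – the sample path is illustrative, and the schema is a simplified subset of the real Land Registry columns: –

```fsharp
open FSharp.Data

type HousePrices = CsvProvider<"SamplePrices.csv", HasHeaders = false, Schema = "TransactionId (string), Price (int), DateOfTransfer (date), PostCode (string), PropertyType (string), Town (string), County (string)">
```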

In that single line, we now have a strongly-typed way to parse CSV data. Now, let’s move onto the MBrace side of things. I want to start with something simple – let’s get the average sale price of a property, by month, and chart it.

A CloudFlow is an MBrace primitive which allows a distributed set of transformations to be chained together, just as you would with the Seq module in F# (or LINQ’s IEnumerable operators for the rest of the .NET world). The difference is that in MBrace, a CloudFlow pipeline is partitioned across the cluster, making full use of the resources available; only when the pipeline in each partition completes are the results aggregated together again.
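Here’s a hedged sketch of the query – housePricesUrl and cluster are assumed bindings (the gov.uk CSV URL and a handle to a connected MBrace cluster), and the exact operators may differ from the original script: –

```fsharp
open MBrace.Flow

let averagePriceByMonth =
    CloudFlow.OfHttpFileByLine housePricesUrl                 // one CSV row per line
    |> CloudFlow.map (fun line -> (HousePrices.ParseRows line).[0])
    |> CloudFlow.map (fun sale -> sale.DateOfTransfer.Month, float sale.Price)
    |> CloudFlow.toArray
    |> cluster.Run                                            // execute on the cluster
    |> Array.groupBy fst
    |> Array.map (fun (month, sales) -> month, sales |> Array.averageBy snd)
```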

Also notice that we’re using type providers in tandem with the distributed computation. Once we call the ParseRows function, the next call in the pipeline is working with a strongly-typed object model – so DateOfTransfer is a proper DateTime etc. All dependent assemblies have automatically been shipped to the cluster; MBrace wasn’t explicitly designed to work with FSharp.Data – it just works. So now that we have an array of int * float, i.e. month * price, we can easily map it onto a chart: –

(Chart: average sale price by month.)

Easy.

Persisted Cloud Flows

Even better, MBrace supports something called Persisted Cloud Flows (known in the Spark world as RDDs). These are flows whose results are partitioned and cached across the cluster, ready to be re-used again and again. This is particularly useful if you have an intermediary result set that you wish to query multiple times. In our case, we might persist the first few lines of the computation (which involves downloading the data from source and parsing with the CSV Type Provider), ready to be used for any number of strongly-typed queries we might have: –
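A hedged sketch of that shape (CloudFlow.cache and the query below are illustrative of the API rather than lifted from the original script): –

```fsharp
// Download + parse once, then keep the partitioned results in memory on each worker.
let persistedPrices =
    CloudFlow.OfHttpFileByLine housePricesUrl
    |> CloudFlow.map (fun line -> (HousePrices.ParseRows line).[0])
    |> CloudFlow.cache
    |> cluster.Run

// Subsequent strongly-typed queries run against the in-memory PersistedCloudFlow.
let millionPoundSales =
    persistedPrices
    |> CloudFlow.filter (fun sale -> sale.Price > 1000000)
    |> CloudFlow.length
    |> cluster.Run
```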

So notice that the first query takes 45 seconds to execute, which involves downloading the data and parsing it via the CSV type provider. Once we’ve done that, we persist it across the cluster in memory – then we can re-use that persisted flow in all subsequent queries, each of which just takes a few seconds to run.

Conclusion

MBrace is on the cusp of a 1.0 release – it’s ready for you to start using now, and offers a powerful and flexible set of abstractions for distributed computations. As you can see from the above, if you’ve used the collection libraries in F# before, it’s a very smooth transition to make the leap to distributed collection queries. In less than ten lines of code, you can start writing distributed queries against live datasets with the minimum of effort.

Stateless services on Azure Service Fabric in F#


In my previous posts, I discussed the use of the Service Fabric (SF) actor framework (which is loosely based on Orleans) and F#, and how we can use FP features within an actor model, even one designed for OO languages.

Exposing Services with Service Fabric

Ironically, the actor framework within SF is one of its more complex features – you can use SF to host literally any .NET code you want. There are a number of features within SF designed to allow you to rapidly host scalable systems, with support for state replication out-of-the-box. In this post, I want to illustrate the steps needed to host Suave, the F#, FP-first web server, in Service Fabric. It turns out that there’s really not much code needed at all.

  1. We create an F# executable that is compatible with Service Fabric.
  2. We create a service that inherits from the StatelessService class (we’ll discuss Stateful Services in another post).
  3. We override the CreateCommunicationListener method. This is important – essentially this method’s responsibility is to create an object that can handle incoming traffic from external sources. We also make a note of the port that Suave will be running on.
  4. We configure an endpoint in the Service Fabric configuration for that same port. This tells SF to allow inbound traffic. It’s roughly analogous to opening up an endpoint in Cloud Services, and it’s something you should also have specified when creating the cluster itself in Azure (if not, you’ll need to manually configure the load balancer to allow traffic through).
  5. In our Main program, we register the service with SF.

The key part is (3), where we implement the functionality that should get called to handle incoming requests. It’s pretty basic really: –
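The following is a hedged sketch of steps 2 and 3, following the 2015-era SF preview API (CreateCommunicationListener plus an Initialize method on the listener); the port, address and greeting are illustrative: –

```fsharp
open System.Threading.Tasks
open Microsoft.ServiceFabric.Services
open Suave
open Suave.Successful

type SuaveService() =
    inherit StatelessService()

    override __.CreateCommunicationListener() =
        { new ICommunicationListener with
            member __.Initialize initParams = ()   // read the endpoint port from config here
            member __.OpenAsync cancellationToken =
                // Start Suave in the background and publish the listening address.
                let _, server = startWebServerAsync defaultConfig (OK "Hello from Service Fabric!")
                Async.Start(server, cancellationToken)
                Task.FromResult "http://localhost:8083"
            member __.CloseAsync _ = Task.FromResult(()) :> Task
            member __.Abort() = () }
```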

So CreateCommunicationListener() expects an instance of ICommunicationListener that will create the web server for us. Luckily, with F#’s object expressions we don’t even have to declare a formal type – we can simply create the object on the fly. As you can see, all it does is start up Suave using default settings. You might elect to supply the port that it starts on from the endpoint configuration in Service Fabric – this is done in the Initialize method, and is included in the full sample.

Once done, you can configure the scalability of the service in config – if you want three instances, just set the instance attribute to 3 in the ApplicationManifest file of your hosting Service Fabric application. If you want it on every node, you set the attribute to -1 (because we all know that -1 is the universal standard for “absence of a number” – we don’t need no option types ;-). Note that running this locally with multiple instances won’t work, since they’ll all try to run on the same port, but across a real multi-node cluster it’d work fine.

As an aside, if you want an arbitrary service that doesn’t need to handle incoming traffic – e.g. something subscribing to Service Bus or writing to a database – you don’t have to implement anything regarding ICommunicationListener. There’s simply a RunAsync() method inside which you can put any code that you want.

So, there you have it: Suave running in Service Fabric with a minimal amount of code. For this you get an auto-load-balancing, scalable and automatically healing service. In my next post, I’ll demonstrate Stateful Services and how we can use them to automatically manage state across a cluster of services.

Building Azure Service Fabric Actors with F# – Part 2


In Part 1, I provided an overview of what Service Fabric (SF) is, and gave some step-by-step guidance on how to get up and running with the local Service Fabric installation. In this post, I want to move from the infrastructure to the code, and show how we can use F# with an Actor model designed primarily for C# and VB .NET, whilst still retaining an idiomatic F# feel where possible.

All code for the full sample used as the basis for this series is available here.

Actors in Service Fabric

Firstly, I’ll show you some elided examples of how we modelled some features of my cat as an Actor in Service Fabric. Every cat has some state which is affected by the actions it performs, and this state needs to be persisted across calls – in Service Fabric, we call this a “stateful” actor. After every “state-updating” action (in SF terms, this equates to a method call on the actor), SF will automatically persist your state back to disk and replicate it to other nodes in the SF cluster (typically at least two others); if your primary node goes down, one of the secondaries will immediately take over and the failed node will be silently replaced in the background. You can also have so-called “read-only” actions, which do not modify state but typically return some payload to the caller – think of these as “getter” methods / properties on a class. You’ll normally have a mix of both state-mutating and read-only methods on a given actor.

Implementing Stateful Actors in F#

Every stateful Actor in SF inherits from the type Actor<T>, where T is the state that needs to be persisted; it shows up as a member property on the actor called State. Service Fabric will automatically create one of these when starting each actor, and silently persist / load it across calls.

We’ll start by modelling the state on the Actor with a standard OO class in F# – see below. Notice the DataContract and DataMember attributes – these are used by the persistence layer of SF to de/re-hydrate state onto an Actor. Personally, I’m not particularly fond of these attributes – there are plenty of serialisation frameworks out there that seem to work just fine without decorating every single property, so why are we stuck with this old-school approach? Perhaps there’s a way to replace the serialisation in SF – I haven’t tried yet.
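Here’s a hedged reconstruction of that state class – the property names are illustrative: –

```fsharp
open System.Runtime.Serialization

[<DataContract>]
type CatState() =
    [<DataMember>] member val Hunger = 0 with get, set
    [<DataMember>] member val Happiness = 0 with get, set
    [<DataMember>] member val OwnerHappiness = 0 with get, set
```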

Anyway, here’s an example method on Cat, called Jump(). It takes in a destination of where the cat is jumping to; depending on the destination, this affects the cat’s – and the owner’s – Happiness (in a more fully featured model, the owner would probably be an actor with their own state). The cat will also work up an appetite by Jumping(). Hunger can be alleviated by Feeding() the cat.
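A hedged reconstruction of that method follows – the ICat actor interface is assumed, the destination handling is illustrative, and the Task-returning signatures SF actually requires are elided for brevity: –

```fsharp
type Cat() =
    inherit Actor<CatState>()

    interface ICat with
        member this.Jump destination =
            // State is mutated in several places, statement by statement.
            this.State.Hunger <- this.State.Hunger + 5
            match destination with
            | "Sofa" ->
                this.State.Happiness <- this.State.Happiness + 10
                this.State.OwnerHappiness <- this.State.OwnerHappiness - 2
            | _ ->
                this.State.Happiness <- this.State.Happiness + 1
```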

On the one hand, F# works nicely with interfaces – we still don’t have to specify types, as they are inferred from the interface we’re implementing. However, this sample is still somewhat unsatisfactory to me as an F#-first person: I’m used to creating copies of data from other data, not mutating it. I also don’t like this approach of modifying state in several places arbitrarily – I feel uneasy when seeing code like this. It seems very statement oriented, with side effects everywhere – something I struggle to reason about easily. There must be something better!

Use immutable data structures on Actors

As it turns out, there is. Notice that up until now we’ve basically written everything in an OO style, using standard C# / VB constructs like classes – we’ve not used any F#-specific types. We can actually use many F# features without too much fuss, and they can quickly help us in our quest to get back to sane, easy-to-reason-about code.

Firstly, we can change the way we model our state from a class to an F# record. This actually works without any problem, once you apply the same WCF-style attribute decoration and add the [<CLIMutable>] attribute – this is necessary because, although records boil down to standard classes, by default there are no public setters on their properties, so SF can’t rehydrate state. We can also add in other F#-only features, like units of measure, if we want – as these are a compile-time-only feature, there’s no issue with serialising them.
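Here’s a hedged sketch of the record version, including a unit of measure (the names are illustrative): –

```fsharp
[<Measure>] type calories   // erased at compile time, so serialisation is unaffected

[<DataContract; CLIMutable>]
type CatState =
    { [<DataMember>] Hunger : int<calories>
      [<DataMember>] Happiness : int
      [<DataMember>] OwnerHappiness : int }
```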

On their own, using records within SF only works up to a point – we’re forced to make copies of state, rather than mutating the single attributes of the State member multiple times, which is a good thing. However, it still looks undesirable – we’re now just mutating the State member property on the Actor instead! Plus it’s not clear when and where we should replace the contents of the State member within the method – every time? Once at the end of the method call? Something in between?

Adapting functional patterns into Actors

Let’s take a step back and think about the two types of methods I mentioned earlier – state-updating and read-only calls. The former intends to do some processing and update the State of the actor. The latter typically reads from the State and returns some data to the caller. (I’m setting aside things like calling external dependencies, which for simplicity’s sake we can ignore – plus it really doesn’t affect us here, as we would partially apply our functions with their dependencies.) We can formally specify such actions and implement them with something like this: –
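A hedged sketch of those signatures and a pure implementation (the function bodies are illustrative): –

```fsharp
// A state-updating action produces a new State from the old one;
// a read-only action computes a result from the State without changing it.
type StateUpdate<'TState> = 'TState -> 'TState
type ReadOnlyQuery<'TState, 'T> = 'TState -> 'T

module CatFunctions =
    /// Pure implementation of Jump - a single expression that builds the new state.
    let jump distance (state : CatState) =
        { state with
            Hunger = state.Hunger + distance * 2<calories>
            Happiness = state.Happiness + 1 }

    /// Read-only query - it cannot modify the supplied state.
    let isHungry (state : CatState) = state.Hunger > 50<calories>
```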

Notice how now our functions are much simpler – Jump is made up of a single expression that generates the new State of the Actor, based on the input state and distance – we’re no longer mutating state multiple times, or even once. And because State is an immutable record, it’s impossible to modify the supplied input State ever.

Plugging pure functions into Actors

Now that we’ve formalised how we see our actor methods working, we can re-write our earlier code from the anything-goes, mutate-everywhere style to one that is easier to test, easier to reason about, and more idiomatic from an FP, F# point of view. You’ll notice that the implementation code above is back in a module – so how do we plug this into our OO Actor model?

There are a few ways, but the easiest is with the help of a couple of shim functions that tightly control the mutation of the Actor’s State, whilst delegating to our purely functional code for the business logic. Our core code is kept free from worrying about mutation, which happens in one place and in one consistent manner; our SF Actor model simply delegates to it.
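A hedged sketch of such shims, again eliding the Task-based signatures that SF actor interfaces actually require: –

```fsharp
type Cat() =
    inherit Actor<CatState>()

    // The only two places where State is ever touched.
    member private this.Update updateState = this.State <- updateState this.State
    member private this.Read query = query this.State

    interface ICat with
        member this.Jump distance = this.Update(CatFunctions.jump distance)
        member this.IsHungry() = this.Read CatFunctions.isHungry
```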

A word on Read-Only Service Fabric methods

Another point worth mentioning is Read Only methods in Service Fabric. These are methods by which you, as the developer, tell the SF runtime “I will never amend state in this method – don’t try to persist state at the end of the call”. This is achieved in SF simply by placing the [<Readonly>] attribute on the method. I don’t like this much, for two reasons. Firstly, the attribute differs from the System.ComponentModel [<ReadOnly>] attribute simply by the casing of one character in the type name. Use the wrong one accidentally and things will quickly go pop with your actor (believe me – I did it during the creation of the code referenced in this post; the error that you get isn’t helpful either). The other, more dangerous issue is that there is no compile-time safety around the use of the [<Readonly>] attribute. If you decide to start changing state in one of these calls – tough. You won’t get any support from the compiler, nor from the runtime. Your method simply won’t update state and you’ll be left wondering why your application isn’t behaving correctly.

With the “adapt to a functional style” approach, whilst we don’t eliminate the issue completely – you still have to decorate the methods appropriately – we at least get compile-time checking on read-only functions, because they don’t allow us to return state; you therefore can’t accidentally modify the state of an actor. In addition, because we’re now using records, which are themselves immutable, it’s impossible for us to modify the state that was supplied to us.

For a simple example like the one supplied, one could argue that the extra delegation and modules complicate matters compared to e.g. C# / OO. However, once you start writing even mildly complicated business logic, it quickly becomes a tiny cost compared to the simplification you gain through immutability and records, as well as the usual benefits of F#.

Taking it further

You can take this approach even further – in other actor frameworks, rather than adopting the “method-per-action” approach, a more functional approach is to have a single message which is itself a discriminated union containing all the different messages; we then pattern match on it in order to process the message appropriately. We can apply this sort of pattern for state-updating messages, although it isn’t exactly idiomatic SF actor code (I’ve supplied an example in the source code).
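A hedged sketch of that shape (the message cases are illustrative): –

```fsharp
type CatMessage =
    | Jump of distance : int
    | Feed of amount : int<calories>

// One pure handler, pattern matching over every message the actor understands.
let handle message state =
    match message with
    | Jump distance -> CatFunctions.jump distance state
    | Feed amount -> { state with Hunger = state.Hunger - amount }
```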

Another alternative might be to create a custom Computation Expression (perhaps similar to the Writer monad that Tomas Petricek blogged about many moons ago) in order to make this modification to state even more succinct. Perhaps someone could write one ;-)

Conclusion

We’ve seen how we can marry up some features inherent to the F# type system to enforce a cleaner way of reasoning about the code that our actors implement, through a couple of simple function signatures and some simple adaptors. We’ve also seen how F#, and typical FP paradigms, can be used in a reliable and distributed framework designed for a mutable-first, OO consumer.

In part three, I want to illustrate how we can quickly and easily host arbitrary services on top of Service Fabric in F# for just about any code you might want to write, and how we can easily scale it to large volume.