Introducing Type Providers


Note: This article has been excerpted from my upcoming book, Learn F#. Save 37% with code fccabraham!

Welcome to the world of data! This article will:

  • Gently introduce us to what type providers are
  • Get us up to speed with the most popular type provider, FSharp.Data.

What Are Type Providers?

Type Providers are a language feature first introduced in F# 3.0:

An F# type provider is a component that provides types, properties, and methods for use in your program. Type providers are a significant part of F# 3.0 support for information-rich programming.

https://docs.microsoft.com/en-us/dotnet/articles/fsharp/tutorials/type-providers/index

At first glance, this sounds a bit fluffy – we already know what types, properties and methods are. And what does “information-rich programming” mean? Well, the short answer is to think of Type Providers as T4 Templates on steroids – that is, a form of code generation, but one that lives inside the F# compiler. Confused? Read on.

Understanding Type Providers

Let’s look at a somewhat holistic view of type providers first, before diving in and working with one to see what the fuss is all about. You’re already familiar with the notion of a compiler that parses C# (or F#) code and builds MSIL from which we can run applications. And if you’ve ever used Entity Framework (particularly the earlier versions) – or old-school SOAP web services in Visual Studio – you’ll be familiar with code generation tools such as T4 Templates: tools which can generate C# code from another language or data source.

Figure 1 – Entity Framework Database-First code generation process

Ultimately, T4 Templates and the like, whilst useful, are somewhat awkward to use. For example, you need to hook them into the build system to get them up and running, and they use a custom markup language with C# embedded in them – they’re not great to work with or distribute.

At their most basic, Type Providers are themselves just F# assemblies (that anyone can write) that can be plugged into the F# compiler – and can then be used at edit-time to generate entire type systems for you to work with as you type. In a sense, Type Providers serve a similar purpose to T4 Templates, except they are much more powerful, more extensible, more lightweight to use, and are extremely flexible – they can be used with what I call “live” data sources, as well as offering a gateway not just to data sources but also to other programming languages.

Figure 2 – A set of F# Type Providers with supported data sources

Unlike T4 Templates, Type Providers can affect type systems without re-building the project, since they run in the background as you write code. There are dozens, if not hundreds of Type Providers out there, from working with simple flat files such as CSV, to SQL, to cloud-based data storage repositories such as Microsoft Azure Storage or Amazon Web Services S3. The term “information-rich programming” refers to the concept of bringing disparate data sources into the F# programming language in an extensible way.

Don’t worry if that sounds a little confusing – we’ll take a look at our first Type Provider in just a second.

Quick Check

  1. What is a Type Provider?
  2. How do type providers differ from T4 templates?
  3. Is the number of type providers fixed?

Working with our first Type Provider

Let’s look at a simple example of a data access challenge – working with some soccer results. Rather than working with an in-memory dataset, we’ll work with a larger, external data source – a CSV file that you can download at https://raw.githubusercontent.com/isaacabraham/learnfsharp/master/data/FootballResults.csv. You need to answer the following question: which three teams won at home the most over the whole season?

Working with CSV files today

Let’s first think about the typical process that you might use to answer this question: –

Figure 3 – Steps to parse a CSV file in order to perform a calculation on it.

Before we can even begin to perform the calculation, we first need to understand the data. This normally means looking at the source CSV file in something like Excel, and then designing a C# type to “match” the data in the CSV. Then, we do all of the usual boilerplate parsing – opening a handle to the file, skipping the header row, splitting on commas, pulling out the correct columns and parsing into the correct data types, etc. Only after doing all of that can you actually start to work with the data and produce something valuable. Most likely, you’ll use a console application to get the results, too. This process is more like typical software engineering – not a great fit when we want to explore some data quickly and easily.

Introducing FSharp.Data

We could quite happily do the above in F#; at least using the REPL affords us a more “exploratory” way of development. However, it wouldn’t remove the whole boilerplate element of parsing the file – and this is where our first Type Provider comes in – FSharp.Data.

FSharp.Data is an open source, freely distributable NuGet package which is designed to provide generated types when working with data in CSV, JSON, or XML formats. Let’s try it out with our CSV file.

Scripts for the win

At this point, I’m going to advise that you move away from heavyweight solutions and start to work exclusively with standalone scripts – this fits much better with what we’re going to be doing. You’ll notice a build.cmd file in the learnfsharp code repository (https://github.com/isaacabraham/learnfsharp). Run it – it uses Paket to download a number of NuGet packages into the packages folder, which you can reference directly from your scripts. This means we don’t need a project or solution to start coding – we can just create scripts and jump straight in. I’d recommend creating your scripts in the src/code-listings/ folder (or another folder at the same level, e.g. src/learning/) so that the package references shown in the listings here work without needing changes.

Now you try

  1. Create a new standalone script in Visual Studio using File -> New. You don’t need a solution here – remember that a script can work standalone.
  2. Save the newly created file into an appropriate location as described in “Scripts for the win”.
  3. Enter the following code from listing 1:
// Referencing the FSharp.Data assembly
#r @"..\..\packages\FSharp.Data\lib\net40\FSharp.Data.dll"
open FSharp.Data

// Connecting to the CSV file to provide types based on the supplied file
type Football = CsvProvider< @"..\..\data\FootballResults.csv">

// Loading in all data from the supplied CSV file
let data = Football.GetSample().Rows |> Seq.toArray

That’s it. You’ve now parsed the data, converted it into a type that you can consume from F# and loaded it into memory. Don’t believe me? Check this out: –

Figure 4 – Accessing a Provided Type from FSharp.Data

You now have full IntelliSense over the dataset. You don’t have to manually parse the data set – that’s been done for you. You also don’t need to “figure out” the types – the Type Provider scans through the first few rows and infers the types based on the contents of the file! In effect, rather than using a tool such as Excel to “understand” the data, you can use F# itself as a tool to both understand and explore your data.
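
For example, we can dot straight into a row and pull out strongly-typed fields (a quick sketch – the ``Away Team`` column is assumed here to sit alongside ``Home Team`` in the file): –

// Each row exposes typed properties named after the CSV headers
let firstResult = data.[0]
printfn "%s vs %s" firstResult.``Home Team`` firstResult.``Away Team``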

Backtick members

You’ll see from the screenshot above, as well as from the code when you try it out yourself, that the fields listed have spaces in them! It turns out that this isn’t actually a type provider feature, but one that’s available throughout F#, called backtick members. Just place double backticks (``) at the beginning and end of the member definition and you can put spaces, numbers or other characters in the member definition. Note that Visual Studio doesn’t correctly provide intellisense for these in all cases, e.g. let-bound members on modules, but it works fine on classes and records.
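
For example, here’s a backtick member on an ordinary record (an illustrative type, nothing to do with the football dataset): –

// Backtick members let identifiers contain spaces and other characters
type Player = { ``Full Name`` : string; ``Shirt Number`` : int }
let keeper = { ``Full Name`` = "David Seaman"; ``Shirt Number`` = 1 }
printfn "%s wears number %i" keeper.``Full Name`` keeper.``Shirt Number``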

Whilst we’re at it, we’ll also bring down an easy-to-use F#-friendly charting library, XPlot. This library gives us access to charts available in Google Charts as well as Plotly. We’ll use the Google Charts API here, which means adding dependencies to XPlot.GoogleCharts (which also brings down the Google.DataTable.Net.Wrapper package).

  1. Add references to both the XPlot.GoogleCharts and Google.DataTable.Net.Wrapper assemblies. If you’re using standalone scripts, both packages will be in the packages folder after running build.cmd – just use #r to reference the assembly inside one of the lib/net folders.
  2. Open the XPlot.GoogleCharts namespace.
  3. Execute the following code to calculate the results and plot them as a chart.
data
|> Seq.filter(fun row ->
    row.``Full Time Home Goals`` > row.``Full Time Away Goals``)
|> Seq.countBy(fun row -> row.``Home Team``) // tuples of (team, number of home wins)
|> Seq.sortByDescending snd
|> Seq.take 10
|> Chart.Column // converts the sequence of tuples into an XPlot Column Chart
|> Chart.Show // displays the chart in a browser window

Figure 5 – Visualising data sourced from the CSV Type Provider

We were able to open up a CSV file we’ve never seen, explore its schema, perform some operations on it, and then chart it – all in less than 20 lines of code. Not bad! This ability to rapidly work with and explore datasets we’ve never seen before, whilst still interacting with the full breadth of .NET libraries out there, gives F# unparalleled abilities for bringing disparate data sources into full-blown applications.

Type Erasure

The vast majority of type providers fall into the category of erasing type providers. The upshot of this is that the types generated by the provider exist only at compile time. At runtime, the types are erased and usually compile down to plain objects; if you try to use reflection over them, you won’t see the fields that you get in the code editor.
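
You can see this for yourself with a quick sketch, reusing the data value from the earlier CSV listing: –

// Reflecting over a provided row at runtime reveals the erased type's
// members - not provided properties such as ``Home Team``
let rowType = data.[0].GetType()
printfn "%s" rowType.FullName
rowType.GetProperties() |> Array.iter (fun prop -> printfn "%s" prop.Name)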

One of the downsides is that this makes them extremely difficult (if not impossible) to work with in C#. On the flip side, they are extremely efficient – you can use erasing type providers to create type systems with thousands of types without any runtime overhead, since at runtime they’re just of type Object.

Generative type providers allow for run-time reflection, but are much less commonly used (and from what I understand, much harder to develop).

If you want to know more, download the free first chapter of Learn F# and see this Slideshare presentation. Don’t forget to save 37% with code fccabraham.

Modelling State in F#


This article has been excerpted from Learn F#

Working with mutable data

Working with mutable data structures in the OO world follows a simple model — you create an object, and then modify its state through operations on that object.

Figure 1 – Mutating an object repeatedly

What’s tricky about this model is that it can be hard to reason about your code. Calling a method like UpdateState() above will generally have no return value; the result of calling the method is a side effect that takes place on the object.

Now you try

Let’s now put this into practice with an example — driving a car. We want to write code that allows us to drive() a car, tracking the amount of petrol used; the distance we drive determines the total amount of petrol used.

let mutable petrol = 100.0 // initial state

let drive(distance) = // modify through mutation
    if distance = "far" then petrol <- petrol / 2.0
    elif distance = "medium" then petrol <- petrol - 10.0
    else petrol <- petrol - 1.0

drive("far") // repeatedly modify state
drive("medium")
drive("short")

petrol // check current state

Working like this, it’s worth noting a few things: –

  1. Calling drive() has no outputs. We call it, and it silently modifies the mutable petrol variable — we can’t know this from the type system.
  2. Methods aren’t deterministic. You can’t know the behaviour of a method without knowing the (often hidden) state, and if you call drive("far") 3 times, the value of petrol will change every time, depending on the previous calls.
  3. We’ve no control over the ordering of method calls. If you switch the order of calls to drive(), you’ll get a different answer.

Working with immutable data

Let’s now compare that with working with immutable data structures.

Figure 2 – Generating new state working with immutable data

In this mode of operation, we can’t mutate data. Instead, we create copies of the state with updates applied, and return that to the caller to work with; that state may be passed in to other calls that generate a new state yet again. Let’s now rewrite our code to use immutable data.

// Function explicitly dependent on state - takes in petrol and
// distance, and returns new petrol
let drive(petrol, distance) =
    if distance = "far" then petrol / 2.0
    elif distance = "medium" then petrol - 10.0
    else petrol - 1.0

let petrol = 100.0 // initial state

// storing output state in a value
let firstState = drive(petrol, "far")
let secondState = drive(firstState, "medium")

// chaining calls together manually
let finalState = drive(secondState, "short")

We’ve made a few key changes to our code. The most obvious is that we aren’t using a mutable variable for our state any longer, but a set of immutable values. We “thread” the state through each function call, storing the intermediate states in values, which are manually passed to the next function call. Working in this manner, we gain a few benefits immediately.

  1. We can reason about behaviour more easily. Rather than hidden side effects on private fields, each method or function call can return a new version of the state that we can easily understand. This makes unit testing much easier, for example.
  2. Function calls are repeatable. We can call drive(50.0, "far") as many times as we want, and it’ll always give us the same result. This is known as a pure function. Pure functions have useful properties, such as being able to be cached or pre-generated.
  3. The compiler protects us, in this case, from accidentally mis-ordering function calls, because each function call is explicitly dependent on the output of the previous call.
  4. We can see the value of each intermediate step as we “work up” towards the final state.

Passing immutable state in F#

In this example, you’ll see that we’re manually storing intermediate state and explicitly passing that to the next function call. That’s not strictly necessary, as F# has language syntax to avoid having to do this explicitly.
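
For example, if we flip drive around to take the distance first and the petrol last, as curried arguments, the pipeline operator can thread the state through each call for us (a sketch): –

// With the state as the last curried parameter, |> pipes each
// intermediate state straight into the next call
let drive distance petrol =
    if distance = "far" then petrol / 2.0
    elif distance = "medium" then petrol - 10.0
    else petrol - 1.0

let finalState =
    100.0 // initial state
    |> drive "far"
    |> drive "medium"
    |> drive "short"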

Now you try

Let’s try to make some changes to our drive code.

  1. Instead of using a string to represent how far we’ve driven, use an integer.
  2. Instead of “far”, check if the distance is more than 50.
  3. Instead of “medium”, check if the distance is more than 25.
  4. If the distance is > 0, reduce petrol by 1.
  5. If the distance is 0, make no change to the petrol consumption. Return the same state that was provided. One possible solution is sketched below.
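
Here’s one way that might look (a sketch – your version may differ): –

// distance is now an int, checked against thresholds rather than strings
let drive(petrol, distance) =
    if distance > 50 then petrol / 2.0
    elif distance > 25 then petrol - 10.0
    elif distance > 0 then petrol - 1.0
    else petrol // no distance driven - return the state unchanged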

Other benefits of immutable data

A few other benefits that aren’t necessarily obvious from the above sample: –

  1. When working with immutable data, encapsulation isn’t necessarily as important as it is when working with mutable data. Sometimes encapsulation is still valuable, e.g. as part of a public API — but there are occasions where making your data read-only removes the need to “hide” your data;
  2. Multi-threading. One of the benefits of working with immutable data is that you don’t need to worry about locks within a multi-threaded environment. Because there’s never any shared mutable state, you don’t need to be concerned with race conditions — every thread can access the same data as often as necessary, without change.

Performance of immutable data

I often hear this question — isn’t it much slower to constantly make copies rather than modify a single object? The answer is: yes and no. Yes, it’s slower to copy an object graph than make an in-place update. But unless you’re in a tight loop, performing millions of mutations, the cost of doing it is negligible compared to, say, opening a database connection. Plus, many languages (including F#) have specific data structures designed to work with immutable data in a highly performant manner.

If you want to learn more about F#, go download the free first chapter of Learn F# and see this Slideshare presentation for more information and a discount code.

Learn F# for the masses


Anyone who reads my blog will have observed that I’ve not posted anything for several months now. In addition to my moving country and trying to build a company in 2016, I’ve also been writing a book.

I’m delighted to share that Learn F# is now available on Manning’s MEAP program – hopefully the book will be content complete within the next couple of months.

The book is designed specifically for C# and VB .NET developers who are already familiar with Visual Studio and the .NET ecosystem to get up to speed with F#. Half the book focuses on core language features, whilst the second half looks at practical use cases of F# in areas such as web programming, data access and interoperability.

The book doesn’t focus on theoretical aspects of functional programming – no discussion of monads or category theory here – but rather attempts to explain to the reader the core fundamentals of functional programming (at least, in my opinion) and apply them in a practical sense. In this sense I think the book doesn’t overlap too much with many of the F# books out there – it doesn’t give you a hardcore understanding of the mathematical fundamentals of FP, and it relates many concepts to those that the reader will already be familiar with in C# – but it will give you confidence to use, explore and learn more about F# alongside what you already know.

I’d like to think it will appeal to those developers that are already on the .NET platform and want to see how they can utilise and benefit from F# within their “day-to-day” without having to throw away everything they’ve learned so far. So you’ll see how to perform data access more easily without resorting to Entity Framework, how to perform error handling in F# in a more sane manner, parsing data files, and creating web APIs, whilst using FP & F#-specific language features directly in solving those problems.

I’ll blog about my experiences of writing a book when it’s done – for now, I hope that this book is well received and a useful addition to the excellent learning materials already available in the F# world.

Hosting Suave in the Azure App Service


In my previous post, I spoke about the deployment aspects of the Azure App Service, and how in conjunction with Kudu, F# and FAKE, we can utilise a SCM-based solution for deployment that can essentially follow the exact same build process as is performed locally.

In this post I want to discuss the process behind hosting Suave (or indeed any application that listens to HTTP traffic) in the Azure App Service.

What is Azure App Service?

The Azure App Service is a broad service that contains multiple “sub services”: –

  • Web apps
  • Logic apps
  • Mobile apps
  • API apps

We’re interested in Web Apps in this post. If you’ve used Azure before and had an ASP .NET web application, it was an easy decision to pick the Azure App Service as the service to host your app. What’s not so well known – and I admit that until I spent some time looking into it I just assumed that it wasn’t possible – is that you can use the service to host any executable application within the IIS process, and have the app service simply act as a pass-through, routing HTTP requests through to your application and back again.

Why would you want to do this though? Why not just use a Cloud Service or raw VM? I would direct you to my previous post on Azure services but in a nutshell, the app service provides a higher-level service than either of the others – think of it as IIS-as-a-service – with support for: –

  • SCM-based deployment e.g. GitHub, BitBucket etc.
  • Metrics and alerting services
  • Scale up application size on demand
  • Automatic load balancing with scale out on demand or based on metrics such as CPU
  • Turn-key authentication features
  • Slots – deploy different versions of code to test before flipping to live
  • A/B testing support
  • Web jobs

So, basically a lot of things that you’d need to manage yourself in a production web application all come out of the box. And the good thing is that you can get all of these features with e.g. a Suave web application as well – it’s not just for ASP .NET.

Creating an Azure Web App

To create a web app, simply log into the Azure portal, select New from the left hand side menu, choose Web and Mobile and finally Web App. Fill in the details, confirm, and you’ll end up with an empty website that you can browse to and receive a stock Azure Website page. So now we have an empty application, how do we put our code into the app?

Binding Suave to an Azure Web App

The first thing you’ll need to do is get your code into the Azure web app that you’ve just created. There are a number of ways that you can achieve this.

Firstly, you can use SCM-based deployment, which I detailed in my previous post. But a quicker way to go for a “one-off” deployment is simply to FTP in and copy the files across. To do this, in the Portal, navigate to your empty web application and hit the Get Publish Settings option from the menu bar of the web app pane. This will give you an XML file, inside of which are the FTP address and credentials. You can then FTP in and simply copy your application into the wwwroot folder.

Note that you can also use HTTP as well as MS Web Deploy (either through the command line or Visual Studio), although I suspect that that would require making your Suave application appear as a web app through custom project GUIDs.

Configuring the Azure App Service

A standard Suave application (at least, in all the examples I’ve seen) runs as either an .fsx script or an executable. Indeed, there are already a few examples of running Suave within an Azure website – and I should give credit to Scott Hanselman and Steffen Forkmann for getting the basic Suave example up and running here. The majority of what I’ve done from here is based on that work – the difference is that rather than hosting FAKE itself, which runs a simple Suave application within an .fsx file, I’m not using FAKE as a host at all (although I do use FAKE for the build stage as per my previous blog post). Instead, all I’m doing is hosting a .NET executable that launches Suave.

So how do we do it? Bear in mind that the Azure App Service is essentially just IIS as a managed service. It’s actually rather straightforward once you know what’s required, which is simply to instruct IIS to redirect all traffic to our Suave application. How do we do that?

Adding a custom web.config

In addition to your standard executable app.config, which contains all the config and binding redirects etc., you need a slimline web.config which IIS uses to start your application up and then redirect traffic to it. It doesn’t contain much: –

<?xml version="1.0" encoding="UTF-8"?>
<configuration>
  <system.webServer>
    <handlers>
      <remove name="httpplatformhandler" />
      <add name="httpplatformhandler" path="*" verb="*" modules="httpPlatformHandler" resourceType="Unspecified"/>
    </handlers>
    <httpPlatform stdoutLogEnabled="true" stdoutLogFile="suave.log" startupTimeLimit="20" processPath="%HOME%\site\wwwroot\SuaveHost.exe" arguments="%HTTP_PLATFORM_PORT%"/>
  </system.webServer>
</configuration>

The key parts to take away from this are: –

  1. Remove the standard HTTP Platform Handler and replace with another one.
  2. Specify to use SuaveHost.exe (my application name) as the application in the processPath attribute.
  3. Pass in an argument to the application – the internal port that traffic will come in on using the %HTTP_PLATFORM_PORT% variable. You can pass in multiple arguments here, just ensure they are space separated.

Handling arguments in Suave

Now that we have hooked our application into IIS, in Suave we simply run our application and take in the port number as an argument: –

[<EntryPoint>]
let main [| port |] = // IIS passes the internal port as the single argument
    let config =
        { defaultConfig with
              bindings = [ HttpBinding.mk HTTP IPAddress.Loopback (uint16 port) ]
              listenTimeout = TimeSpan.FromMilliseconds 3000. }
    // rest of application, e.g. startWebServer config app

That’s it!

Managing Suave through Azure App Service

Now that you have your application up and running in Azure, what can you do? Well, you can log into the Azure portal and get some metrics of your website immediately as a configurable dashboard: –


The charts are configurable, so you can select which metrics you’d like to show e.g. which HTTP codes etc. over what time period. We can also look at the process explorer – and sure enough, there’s our SuaveHost.exe application: –


And we can even drill into the process: –


Conclusion

Of course, what I’ve shown you above is just scratching the surface of what you can do with Azure. It’s possible to do all the other things I mentioned at the start of this post, such as scale up the size of the web server, scale out to multiple instances, create multiple deployment slots etc., all from within the portal. Or perhaps you’d like to set up custom alerts based on any of the dashboard metrics over a certain period of time e.g. “> 50 HTTP 404s in the last 5 minutes” and send an email / hit an HTTP endpoint etc.? No problem – that’s supported out of the box.

It’s actually all incredibly easy and really allows you to simply focus on the work of developing an application and let Azure manage the infrastructural challenges. In fact, I can’t imagine self-hosting (or self-managing) any web-facing application when you have a service like this available. Hopefully I’ve shown, though, that it’s not just the stock ASP .NET website that can be run through Azure web apps – we can host Suave as well without much effort at all.

The source code that was used for this post is available here.


Deploying Azure web applications with FAKE


The Azure App Service is a great service that makes hosting web-facing applications extremely easy, with support for many value adds out of the box e.g. scale out, A/B testing and authentication are all included. I’ve recently been looking at how you can use this service within the context of some F# frameworks and libraries e.g. Suave. I’ll blog about the Suave side of things in another post – there’s a lot to it – but one of the other parts I wanted to mention was that FAKE now has support for Kudu, the Azure App Service SCM deployment engine.

What is Kudu?

One of the features that App Service offers is a multitude of deployment options, including FTP, HTTP and source control web hooks. The web hooks support a number of providers, including GitHub, BitBucket, VSTS and even a locally hosted git repository. The App Service listens to push events on a specific branch, downloads the source code onto the web server (into a sandboxed location) and then copies it into the website proper. That last stage – the copy – is the most interesting one. Essentially, the app service runs a batch file which can do whatever is needed to build and deploy. For a .NET application, this typically includes: –

  1. Perform an MSBuild of the application.
  2. Copy the outputs to the “staging” directory on the web site.
  3. Run KuduSync to deploy to the actual web application folder.

KuduSync itself essentially does a few things: –

  • Does a diff of the current files to deploy from the previous deployment.
  • Removes any obsolete files from the app.
  • Copies over any new / updated files from the staging directory to the app.
  • Makes a list of the deployed files for comparison the next time it runs.

The Azure CLI extensions come with some commands to “pre-generate” a batch file for specific, common use cases, e.g. an ASP .NET application, but you’ll often need to do something more than just that – and this means getting your hands dirty with a Kudu script.

A Sample Kudu script

So here’s a standard Kudu build script (which I’ve actually minimised as much as possible) which deploys some raw web assets (HTML, JS etc.), builds a .NET application and deploys a web job: –

:: Restore NuGet packages
.paket\paket.bootstrapper.exe
.paket\paket.exe restore

:: Copy static site content over - note the "excludes.txt" which contains file types to ignore....
xcopy src\webhost "%DEPLOYMENT_TEMP%\" /Y /E /Q /EXCLUDE:excludes.txt
IF !ERRORLEVEL! NEQ 0 goto error

:: Deploy an F# script as a continuously running Web Job
xcopy src\Sample.fsx "%DEPLOYMENT_TEMP%\app_data\jobs\continuous\Sample\" /Y
IF !ERRORLEVEL! NEQ 0 goto error

:: Build to the temporary path
cd "%DEPLOYMENT_SOURCE%"
call :ExecuteCmd "%MSBUILD_PATH%" /m /t:Build /p:Configuration=Release;OutputPath="%DEPLOYMENT_TEMP%";UseSharedCompilation=false %SCM_BUILD_ARGS% /v:m
IF !ERRORLEVEL! NEQ 0 goto error
cd ..

:: KuduSync
call :ExecuteCmd "%KUDU_SYNC_CMD%" -v 50 -f "%DEPLOYMENT_TEMP%" -t "%DEPLOYMENT_TARGET%" -n "%NEXT_MANIFEST_PATH%" -p "%PREVIOUS_MANIFEST_PATH%" -i ".git;.hg;.deployment;deploy.cmd"
IF !ERRORLEVEL! NEQ 0 goto error

There’s actually a whole host of things here: –

  1. First I’m pulling down my NuGet dependencies with Paket (this could just as easily be NuGet.exe) before moving onto the main build.
  2. xcopy across the website assets. There are some files I don’t want, so I also pass an “excludes.txt” file, containing the file types to skip, as an argument to xcopy. Figuring this out – and the correct arguments to xcopy – was a real pain.
  3. Copy across an .fsx file as a web job. I needed to figure out how web jobs are stored on the app service in order to know the path to build up, of course.
  4. Do some jumping around of folders before doing an MSBuild of my application.
  5. Call Kudu Sync to do the final deploy, passing in the set of folder locations needed for the tool.

Using batch files for a build pipeline probably isn’t the best way to go. Managing a set of build steps quickly becomes a pain in a batch file – you have GOTOs and labels everywhere, and you can’t express complex control flow. Imagine you wanted to run unit tests and, if they failed, perform one set of tasks, but if they passed, do something else – it quickly becomes a nightmare.

Enter FAKE

On the other hand, FAKE is an excellent library and DSL designed to manage a build pipeline. Not only does it have loads of helpers for e.g. file system access, config file rewriting, environment variables and MSBuild, but it allows us to define build pipelines with dependencies – even conditional stages. And because FAKE is just F# and runs on the full .NET framework, you can always break out and run any .NET code you want directly from within a FAKE script. With FAKE, you can have a single build script for local builds and CI builds – and it now supports Kudu deployment builds too, through the newly-added Kudu module in FAKE. Let’s see what the above build script looks like in FAKE.
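
Here’s a sketch of such a script – the deploymentTemp and kuduSync helpers come from FAKE’s Kudu module, and the paths and globs here are illustrative: –

// build.fsx - a sketch of the Kudu-driven FAKE build
#r @"packages/FAKE/tools/FakeLib.dll"
open Fake
open Fake.Azure

Target "CopyWebsite" (fun _ ->
    // stage static site assets into Kudu's temporary deployment folder
    !! "src/webhost/**/*.*"
    |> CopyFiles Kudu.deploymentTemp)

Target "BuildSolution" (fun _ ->
    // MSBuild the app, outputting into the staging folder
    !! "src/**/*.sln"
    |> MSBuildRelease Kudu.deploymentTemp "Build"
    |> ignore)

Target "Deploy" (fun _ ->
    // final sync from staging into the live website folder
    Kudu.kuduSync())

// compose the steps into a build chain
"CopyWebsite" ==> "BuildSolution" ==> "Deploy"
RunTargetOrDefault "Deploy"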

You can see here that there are several distinct build steps, which are composed together as dependencies on one another at the very end using the ==> operator. Note that the code above is actually just F# although we’re using a specific DSL with custom operators to set up a “build chain”. So if any of the stages fail, the whole build will fail and we’ll be presented with a summary log (which we can see directly in the Azure portal) of the results of the build. Notice also the lack of environment variables etc. – the Kudu helper module takes care of all of that for us – whilst we don’t need gotos anymore because FAKE handles the build pipeline.

Now our Kudu script is much simpler, because we’re delegating control of the main build orchestration to a language better able to reason about and define program flow: –

:: Restore NuGet packages
.paket\paket.bootstrapper.exe
.paket\paket.exe restore

:: Start main build script
packages\FAKE\tools\FAKE.exe build.fsx

Conclusion

Kudu and Azure App Service are great tools. By plugging FAKE into the mix, we get both a succinct and easy to use scripting experience with the power of the .NET framework and a fantastic language like F# as well.

Working with running totals in F#


This post is an expanded version of February’s F# Gazette.

A common issue for developers who come from an OO background is how to create accumulations over sequences of data in an effective, succinct style without resorting to mutable state.

Let’s start with a fictional set of results for a sports team, and a function to convert from an outcome to the points earned: –

let results = [ "W"; "L"; "L"; "D"; "W"; "W"; "L"; "D" ]
let convertToPoints = function | "W" -> 3 | "D" -> 1 | _ -> 0

We would like to graphically show the form of the team by accumulating the points earned over the series.


So the question really is: how can we easily go from a series of singular results to one that maintains a running total of all the previous results as well? Let’s look at some alternatives that you might come up with.

Managing accumulation with mutation

The implementation that you might initially consider is one that mutates a collection to track the running total.
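
Something along these lines (a reconstruction of the style in question): –

// No mutable keyword in sight, but Add() mutates the output collection
let runningTotals = ResizeArray [ 0 ]
for result in results do
    runningTotals.Add(runningTotals.[runningTotals.Count - 1] + convertToPoints result)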

This code is not terrible – it’s not that hard to infer its meaning, particularly if you come from an imperative background. And although this code doesn’t use the mutable keyword anywhere, it’s still of course mutating the output collection through the call to .Add(). So, because we’re functional programmers and we’ve been told that mutation is the spawn of Satan, we have to find another way around this. Let’s try recursion – that’s usually a good trick.

Managing accumulation with recursion
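
A recursive version might look something like this (a sketch): –

let runningTotals results =
    // build up the output on the head of a list, reversing at the end
    let rec loop acc total = function
        | [] -> List.rev acc
        | result :: rest ->
            let nextTotal = total + convertToPoints result
            loop (nextTotal :: acc) nextTotal rest
    loop [ 0 ] 0 results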

Ehhhhh. I’m not fond of this solution. I sometimes find it hard to reason about recursion, and this is a good example – I need to think about the “end case” of the recursive function, and the code doesn’t look particularly nice. We have to manually manage the output list, and remember to reverse it at the end (because in F#, it’s more efficient to add to the head of a list than the tail). But at least we got rid of that mutation, which is all-important, right?

Improving readability through higher order functions

This sort of dogmatic approach is guaranteed to turn people off FP. Yes, mutation is best avoided (although in a case like this I would argue it’s not that dangerous, as the mutable state is short-lived – just a few lines of code), but in this case the cost of the recursive version in terms of readability is too high. We can improve things by using the fold() function that exists in the collection libraries (I’ve put type annotations in for illustrative purposes, but they’re not necessary): –
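
// A sketch: the accumulator is the list of running totals so far,
// built head-first and reversed at the end
let runningTotals =
    results
    |> List.fold (fun (state : int list) (result : string) ->
        (convertToPoints result + List.head state) :: state) [ 0 ]
    |> List.rev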

fold() is nice! It generalises the problem of managing state for us across iterations of a collection, so we don’t have to pattern match or manually pass items into a recursive function. This code is certainly shorter than the recursive version, and once you know how fold works, it’s much easier to reason about. We start with a list of [ 0 ], calculate the next item by adding the head of the list to the next result that fold() passes us, and push that answer onto the head of the list: –

[0], W - add 3 points.
[3; 0], L - add 0 points.
[3; 3; 0], L - add 0 points.
[3; 3; 3; 0], D - add 1 points.
[4; 3; 3; 3; 0], W - add 3 points.
[7; 4; 3; 3; 3; 0], W - add 3 points.
[10; 7; 4; 3; 3; 3; 0], L - add 0 points.
[10; 10; 7; 4; 3; 3; 3; 0], D - add 1 points.
val it : int list = [0; 3; 3; 3; 4; 7; 10; 10; 11]

However, we’re still having to manually maintain the output list ourselves and then reverse it with that ugly List.rev at the end.

Using higher-level folds

So it turns out that there’s a derivative, constrained version of fold() that is designed for exactly what we need – generating output lists based on accumulations – and it’s called scan(). Scan works exactly like fold, except it generates a list automatically based on the result of each iteration. Here’s the code: –
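
// A sketch using List.scan: the same folder function as fold, but scan
// emits every intermediate total for us, including the initial state
let runningTotals =
    results
    |> List.scan (fun total result -> total + convertToPoints result) 0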

That’s it. In this version, we don’t need to manage state across calls, just like fold – but we don’t need to build the output list either, because scan does that for us! In addition, we don’t need to reverse the list – again, that’s taken care of for us. The same sort of logging as above now yields the following: –

Current form is 0, result is W - add 3 points.
Current form is 3, result is L - add 0 points.
Current form is 3, result is L - add 0 points.
Current form is 3, result is D - add 1 points.
Current form is 4, result is W - add 3 points.
Current form is 7, result is W - add 3 points.
Current form is 10, result is L - add 0 points.
Current form is 10, result is D - add 1 points.
val it : int list = [0; 3; 3; 3; 4; 7; 10; 10; 11]

Notice that this time, we never actually see the output list within each iteration – instead we’re just given the current running total.

Taking it even further

If you want to be one of the cool kids you can reorder the transformations so that you can make the code even more succinct: –
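
// A sketch: convert every result to points first, then scan with (+)
results
|> List.map convertToPoints
|> List.scan (+) 0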

And this can be simplified further still by taking advantage of curried functions as: –
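
// A sketch: currying and composition collapse the pipeline into one function
let runningTotals = List.map convertToPoints >> List.scan (+) 0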

Conclusion

I daresay that you might not necessarily want to go as far as this last version, depending on your experience with currying and composition in F#, but you can see that by delegating the “boilerplate” of iteration with state, and outputting a running total, we can reduce even the mutating version to just: –

  • our original function to go from W | D | L -> points
  • two higher order functions that we compose together
  • an addition operator
  • a starting state of 0

No mutation, for loops, recursion or list manipulation.

And just to come full circle – I believe that many of the accumulation functions within the collections library in FSharp.Core (including fold()) actually use mutable variables internally!

Visual Studio Team Services and FAKE


What is VSTS?

Visual Studio Team Services (VSTS) is Microsoft’s cloud-based source control / CI build / work item tracking system (with a nice visual task board). It’s a platform that is evolving relatively quickly, with lots of new features being added all the time. It also comes with a number of plans including a free plan which entitles you to an unlimited number of private repositories with up to 5 users (and MSDN users do not count), plus a fixed number of hours for centralized builds.

The catch is that there’s no way (at least, I couldn’t find one!) to host a completely public repository with unlimited users – obviously a big problem if you want to have an open source project with lots of contributors. But for a private team, it might be a good model for you. You can also use the CI build facilities of VSTS with GitHub repositories – in this sense you can treat it as a competitor to something like AppVeyor, perhaps.

Contrary to common opinion, VSTS is completely compatible with Git as a source control repository. Yes, you can opt to use TFS as a source control model, but (in my opinion) you’d have to be crazy to do this or have a team that are really used to the TFS way of working – I find Git to be a much, much more effective source control system.

Why FAKE?

I wanted to try and see whether it was possible to get VSTS working with FAKE.

One of the best things about FAKE, in addition to the ease of use, flexibility and power that you get by creating build tasks directly within F# (and therefore with the full .NET framework behind it), is that because you are not dependent on hosting a bespoke build server with custom tasks, such as Team City, it’s extremely rare (hopefully never) that a build runs locally but fails to run on the build server.

Unlike relying on e.g. Team City to orchestrate your build, you delegate the entire CI build to FAKE – MSBuild, unit tests, configuration file rewriting and so on. If the build fails, you don’t have to log into the Team City box and check log files – your FAKE script does all the heavy lifting, and you can run the exact same steps locally.

Putting it all together

So my goal was to write a simple FAKE build script which pulled down any dependencies, performed a build and ran unit tests – all integrated within VSTS. As it turns out, it wasn’t very difficult at all.

Firstly, we hook up the build to source control. In our case, it’s the Git repository of the Team Project, so works straight out of the box, but you can point to another Git repository e.g. GitHub as well. You can also select multiple branches. We then set a trigger to occur on each commit.

Secondly, we have to set up the actual build steps. As we’re delegating to FAKE to perform the whole build + tests, we want to use as few “custom” VSTS tasks as possible. In fact, we actually only need two steps.

  1. Some way to download Paket or Nuget, and then initiate the FAKE build.
  2. Some way of tying the results of the xUnit tests that we’re going to run in FAKE into the VSTS test reports.

Unlike old-school TFS etc., VSTS now has an extensible and rich set of build tasks that you can chain together – no need for Workflow Foundation etc. at all here: –


Notice the “Batch Script” task above – perfect for our needs, as we can use it to perform our first build task to download Paket and then start FAKE.

We can now see what the FAKE script does – this is probably nothing more than what you would do normally with FAKE anyway to clean the file system, perform a build and then run the unit tests.
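
Something along these lines (a sketch – the globs, paths and xUnit parameters are illustrative): –

// build.fsx - clean, build, then run xUnit tests, emitting an XML report
#r @"packages/FAKE/tools/FakeLib.dll"
open Fake
open Fake.Testing

Target "Clean" (fun _ -> CleanDir "bin")

Target "Build" (fun _ ->
    !! "src/**/*.sln"
    |> MSBuildRelease "bin" "Build"
    |> ignore)

Target "Test" (fun _ ->
    // the XML output is what VSTS' Publish Test Results task picks up
    !! "bin/*Tests*.dll"
    |> xUnit2 (fun p -> { p with XmlOutputPath = Some "bin/TestResults.xml" }))

"Clean" ==> "Build" ==> "Test"
RunTargetOrDefault "Test"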

Notice that when we run the unit tests, we also emit the results as an XML file. This is where the second VSTS build task (Publish Test Results) comes in: it parses the XML results and ties them into VSTS’ build report.


So when we next perform a commit, we’ll see a build report that looks something like this: –


Notice that the chart on the right shows that I’ve run 2 unit tests that were successful – this is the second build task parsing the XUnit output. Of course we can also drill into the different stages if needed to see the output: –


Conclusion

This post isn’t so much about either VSTS or FAKE features per se as it is about illustrating how both VSTS and FAKE are flexible enough that we can plug the two together. What’s great about this approach is that we’re not locked into VSTS as a build system – we’re just using FAKE and running it centrally. But if we’re using VSTS, we can also benefit from the integration that VSTS offers with e.g. Visual Studio and the build system – creating work items, associating commits to work items and viewing them from VS – whilst still using FAKE for our build.