In my previous posts, I discussed the use of the Service Fabric (SF) actor framework (which is loosely based on Orleans) and F#, and how we can use FP features within an actor model, even one designed for OO languages.
Exposing Services with Service Fabric
Ironically, the actor framework with SF is one of its more complex features – you can use SF to host literally any .NET code you want. There are a number of features within SF designed to allow you to rapidly host scalable systems, with support for state replication out-of-the-box. In this post, I want to illustrate the steps needed to host the F#, FP-first web server Suave in Service Fabric. It turns out that there’s really not much code needed at all.
- We create an F# executable that is compatible with Service Fabric.
- We create a service that inherits from the StatelessService class (we’ll discuss Stateful Services in another post).
- We override the CreateCommunicationListener method. This is important – essentially this method’s responsibility is to create an object that can handle incoming traffic from external sources. We also make a note of the port that Suave will be running on.
- We configure an endpoint in the Service Fabric configuration for that same port. This tells SF to allow inbound traffic. This is roughly analogous to opening up an endpoint in Cloud Services. This is something you should have also specified when creating the cluster itself in Azure (if not, you’ll need to manually configure the load balancer to allow traffic through).
- In our Main program, we register the service with SF.
The key part is the third step (overriding CreateCommunicationListener), where we implement the functionality that should get called to handle incoming requests. It’s pretty basic really: –
So CreateCommunicationListener() expects an instance of ICommunicationListener that will create the web server for us. Luckily with F#’s object expressions we don’t even have to declare a formal type – we can simply create the object on the fly. As you can see, all it does is start up Suave using default settings. You might elect to supply the port that it starts on from the endpoint configuration in Service Fabric – this is done in the Initialize method, and is included in the full sample.
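To illustrate the object-expression technique on its own, here is a minimal sketch. It deliberately uses a stand-in interface rather than the real ICommunicationListener (whose members are not reproduced here), and a stub function in place of starting Suave. `IListener`, `createListener` and `startServer` are hypothetical names, assumed purely for illustration.

```fsharp
// Stand-in for a listener interface; NOT the real Service Fabric type.
type IListener =
    abstract Open : unit -> string
    abstract Close : unit -> unit

// Build a listener on the fly with an object expression: no named class is declared.
// `startServer` is a placeholder for whatever starts the web server on the given port.
let createListener (port: int) (startServer: int -> unit) : IListener =
    { new IListener with
        member __.Open() =
            startServer port
            sprintf "http://localhost:%d" port
        member __.Close() = () }

// Usage: record which port the stub server was asked to start on.
let started = ref 0
let listener = createListener 8083 (fun port -> started.Value <- port)
let address = listener.Open()
```

The point is only that the interface implementation lives inline at the call site; the real listener would return whatever address the actual interface expects.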
Once done, you can configure the scalability of the service in config – if you want three instances, just set the instance attribute to 3 in the ApplicationManifest file of your hosting Service Fabric application. If you want it on every node, you set the attribute to -1 (because we all know that -1 is the universal standard for “absence of a number” – we don’t need no option types ;-). Note that running this locally with multiple instances won’t work, since they all try to run on the same port, but in the real world it’d work fine I’m sure.
As an aside, if you want any arbitrary service that doesn’t necessarily need incoming traffic e.g. something subscribing to service bus or writing to a DB, you don’t have to implement anything regarding ICommunicationListener. There’s simply a RunAsync() method that you can put any code inside that you want.
So, there you have Suave in Service Fabric with a minimal amount of code. For this you’ll get an auto-load balancing, scalable and automatically healing service. In my next post, I’ll demonstrate StatefulServices and how we can use them to automatically manage state across a cluster of services.
In Part 1, I provided an overview of what Service Fabric (SF) is, and provided some step-by-step guidance on how to get up and running with the Service Fabric local installation. In this post, I want to move from the infrastructure to the code, and show how we can use F# with an Actor model designed primarily for C# and VB .NET, whilst still retaining an idiomatic F# feel where possible.
All code for the full sample used as the basis for this series is available here.
Actors in Service Fabric
Firstly, I’ll show you some elided examples of how we modelled some features of my cat as an Actor in Service Fabric. Every cat has some state which is affected by actions it does, and needs to be persisted across calls. In Service Fabric, we call this a “stateful” actor. After every “state-updating” action (in SF terms, this equates to a method call on the actor), SF will automatically persist your state back to disk and automatically replicate to other nodes in the SF cluster (typically at least two others); if your primary node goes down, one of the secondaries will immediately take over and the failed node will be silently replaced in the background. You can also have so-called “read only” actions, which do not modify state but typically return some payload to the caller. You can typically think of these as “getter” methods / properties on a class. You’ll normally have a mix of both state-mutating and read-only methods on a given actor.
Implementing Stateful Actors in F#
Every stateful Actor in SF inherits from the type Actor<T>, where T is the state that needs to be persisted. It shows up as a member property on the actor, State. Service Fabric will automatically create one of these when starting every given actor, and silently persist / load it across calls etc.
We’ll start by modelling the state on the Actor by default with a standard OO class in F# – see below. Notice the DataContract and DataMember attributes – these are used by the persistence layer of SF to de/re-hydrate state to an Actor. Personally I’m not particularly fond of these attributes – there are plenty of serialization frameworks out there that seem to work just fine without decorating every single property, so why are we stuck with this old-school approach? Perhaps there’s a way to replace the serialization in SF – I haven’t tried yet.
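As a rough sketch of what such a state class might look like (this is not the code from the sample; the `CatState` name and its properties are assumed for illustration):

```fsharp
open System.Runtime.Serialization

// Hypothetical state class: a mutable OO type, decorated for the serializer.
[<DataContract>]
type CatState() =
    [<DataMember>]
    member val Name = "" with get, set
    [<DataMember>]
    member val Happiness = 0 with get, set
    [<DataMember>]
    member val Hunger = 0 with get, set

// Mutation-style usage: each property is changed in place.
let state = CatState(Name = "Tom")
state.Hunger <- state.Hunger + 1
state.Happiness <- state.Happiness - 1
```

On the full .NET Framework a script would also need a reference to the System.Runtime.Serialization assembly.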
Anyway, here’s an example method on Cat, called Jump(). It takes in a destination of where the cat is jumping to, and depending on the destination, this affects the cat – and the owner’s – Happiness (in a more fully featured model, the owner themselves would probably be an actor with their own state). The cat will also work up an appetite by Jumping(). Hunger can be alleviated by Feeding() the cat.
On the one hand, F# works nicely with interfaces – we still don’t have to specify types, as they are inferred from the interface we’re implementing. However, this sample is still somewhat unsatisfactory to me as an F#-first person: I’m used to creating copies of data from other data, not mutating it. I also don’t like this approach of modifying state in several places arbitrarily – I feel uneasy when seeing code like this. It seems very statement oriented, with side effects everywhere – something I struggle to reason about easily. There must be something better!
Use immutable data structures on Actors
As it turns out, there is. Notice that up until now we’ve basically written everything in an OO style, using standard C#/VB constructs like classes etc. – we’ve not used any F# types. We can actually use many F# features without too much fuss, and they can quickly help us out in our quest to get back to sane and easy-to-reason-about code.
Firstly, we can change the way we model our state from a class to an F# record. This actually works without any problem, once you do the same WCF-style attribute decoration, and add the [<CLIMutable>] attribute – this is necessary as although Records boil down to standard Classes, by default there’s no public setter on any properties, so SF can’t rehydrate state by default. We can also add in other F#-only features, like units of measure, if we want – as these are a compile-only feature, there’s no issue with serialization of them.
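A minimal sketch of the record-based version, assuming the same hypothetical `CatState` shape and an invented `meter` unit (neither is taken from the sample):

```fsharp
open System.Runtime.Serialization

// Units of measure are erased at compile time, so the field is a plain float at runtime.
[<Measure>] type meter

// CLIMutable adds a default constructor and property setters for the serializer,
// while F# code still sees an immutable record.
[<CLIMutable; DataContract>]
type CatState =
    { [<DataMember>] Name: string
      [<DataMember>] Happiness: int
      [<DataMember>] Hunger: int
      [<DataMember>] LastJump: float<meter> }

let tom = { Name = "Tom"; Happiness = 5; Hunger = 0; LastJump = 0.0<meter> }

// Copy-and-update: `tom` itself is left untouched.
let afterJump = { tom with Hunger = tom.Hunger + 1; LastJump = 1.5<meter> }
```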
On their own, using records within SF only works up to a point – we’re forced to make copies of state, rather than mutating the single attributes of the State member multiple times, which is a good thing. However, it still looks undesirable – we’re now just mutating the State member property on the Actor instead! Plus it’s not clear when and where we should replace the contents of the State member within the method – every time? Once at the end of the method call? Something in between?
Adapting functional patterns into Actors
Let’s take a step back and think about the two types of methods I mentioned earlier on – state-updating and read-only calls. The former intends to do some processing, and update the State of the actor. The latter typically reads from the State and returns some data to the caller (I’m setting aside things like calling external dependencies etc. which for simplicity’s sake we can ignore – plus it really doesn’t affect us here as we would partially apply our functions with dependencies). We can formally specify such actions and implement them with something like this: –
Notice how now our functions are much simpler – Jump is made up of a single expression that generates the new State of the Actor, based on the input state and distance – we’re no longer mutating state multiple times, or even once. And because State is an immutable record, it’s impossible to modify the supplied input State ever.
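One way to sketch these two shapes of function, under assumptions: the `Update` and `Query` aliases, the `CatState` fields and the rules inside `jump` are all invented here for illustration and are not the sample's code.

```fsharp
[<Measure>] type meter

type CatState = { Hunger: int; Happiness: int }

// A state-updating action: current state in, new state out.
type Update<'state> = 'state -> 'state
// A read-only action: current state in, some payload out (no state returned).
type Query<'state, 'result> = 'state -> 'result

module Cat =
    // Jumping always builds an appetite; long jumps cost happiness (an invented rule).
    let jump (distance: float<meter>) : Update<CatState> =
        fun state ->
            let mood = if distance > 2.0<meter> then -1 else 1
            { state with
                Hunger = state.Hunger + 1
                Happiness = state.Happiness + mood }

    let feed : Update<CatState> = fun state -> { state with Hunger = 0 }

    let isHungry : Query<CatState, bool> = fun state -> state.Hunger > 0

let start = { Hunger = 0; Happiness = 5 }
let jumped = Cat.jump 1.0<meter> start
```

Because a `Query` has nowhere to return a new state, the type itself rules out accidental updates.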
Plugging pure functions into Actors
Now that we’ve formalised how we see our actor methods working, we can re-write our earlier code from the anything-goes, mutate-everywhere C# style to one that is easier to test, easier to reason about and more idiomatic from an FP, F# point of view. You’ll notice that the implementation code above is back in a module – so how do we plug this into our OO Actor model?
There are a few ways, but the easiest one is with the help of a couple of shim functions that tightly control the mutation of the Actor State, whilst delegating control to our purely functional code for business logic. Our core code is kept free from worrying about the mutation of state, which is now performed in one consistent manner; our SF Actor methods simply delegate to it.
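A sketch of what such shims might look like. To keep it self-contained, `FakeActor` is a stand-in that models nothing but a mutable State property; it is not the SF base class, and the real actor methods would also need to return Tasks, which is left out here. All names are hypothetical.

```fsharp
type CatState = { Hunger: int; Happiness: int }

// Stand-in for the actor base class: just a mutable State property.
type FakeActor<'state>(initial: 'state) =
    member val State = initial with get, set

// The only place where State is ever replaced.
let updateState (actor: FakeActor<'state>) (update: 'state -> 'state) : unit =
    actor.State <- update actor.State

// Read-only calls get the state but cannot hand back a new one.
let readState (actor: FakeActor<'state>) (query: 'state -> 'result) : 'result =
    query actor.State

// Pure business logic, unaware of actors.
let jump state = { state with Hunger = state.Hunger + 1 }
let isHungry state = state.Hunger > 0

let actor = FakeActor<CatState>({ Hunger = 0; Happiness = 5 })
updateState actor jump
let hungry = readState actor isHungry
```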
A word on Read-Only Service Fabric methods
Another point worth mentioning is Read Only methods in Service Fabric. These are methods where you, as the developer, tell the SF runtime “I will never amend state in this method – don’t try to persist state at the end of the call”. This is achieved in SF simply by placing the [<Readonly>] attribute on the method. I don’t like this much for two reasons. Firstly, the attribute differs from the System.ComponentModel [<ReadOnly>] attribute simply by virtue of the fact that it has a different casing on one of the characters in the type. Use the wrong one accidentally and things will quickly go pop with your actor (believe me – I did it during the creation of the code referenced in this post; the error that you get isn’t helpful either). The other, more dangerous issue is that there is no compile time safety around the use of the [<Readonly>] attribute. If you decide to start changing state in one of these calls – tough. You won’t get any support from the compiler, nor from the runtime. Your method simply won’t update state and you’ll be left wondering why your application isn’t behaving correctly.
With the “adapt to a functional style” approach, whilst we don’t eliminate the issue completely – you still have to decorate the methods appropriately – we at least get compile-time checking on read-only functions, because they don’t allow us to return state; you therefore can’t accidentally modify the state of an actor. In addition, because we’re now using records, which are themselves immutable, it’s impossible for us to modify the state that was supplied to us.
For a simple example like the one supplied, one could argue that the extra delegation and modules etc. complicate matters compared to e.g. C# / OO. However, once you start writing even mildly complicated business logic, it quickly becomes a tiny cost compared to the simplification you benefit from through immutability, records etc., as well as the usual other benefits of F#.
Taking it further
You can take this approach even further – in other actor frameworks, rather than adopting the “method-per-action” approach, a more functional approach is to have a single message type which is itself a discriminated union containing all the different messages; we then pattern match on this in order to process the message appropriately. We can apply this sort of pattern for updating-state messages, although it isn’t exactly idiomatic SF actor code (I’ve supplied an example in the source code).
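A small sketch of the single-message idea, with an invented `CatMessage` union and invented rules (this is not the example from the source code):

```fsharp
type CatState = { Hunger: int; Happiness: int }

// Every state-updating action becomes one case of a single message type.
type CatMessage =
    | Jump
    | Feed
    | Stroke of times: int

// One handler pattern matches on the message and returns the new state.
let handle (state: CatState) (message: CatMessage) : CatState =
    match message with
    | Jump -> { state with Hunger = state.Hunger + 1 }
    | Feed -> { state with Hunger = 0 }
    | Stroke times -> { state with Happiness = state.Happiness + times }

// Because the handler is a plain function, a whole run of messages is just a fold.
let final =
    [ Jump; Jump; Stroke 3; Feed ]
    |> List.fold handle { Hunger = 0; Happiness = 5 }
```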
Another alternative might be to create a custom Computation Expression (perhaps similar to the Writer monad that Tomas Petricek blogged about many moons ago) in order to make this modification to state even more succinct. Perhaps someone could write one ;-)
We’ve seen how we can marry up some features inherent to the F# type system in order to enforce a cleaner way of reasoning about the code that our actors have to implement, through a couple of simple function signatures and some simple adaptors. We’ve also seen how F#, and typical FP paradigms, can be used in a reliable and distributable framework designed for a mutable-first OO consumer.
In part three, I want to illustrate how we can quickly and easily host arbitrary services on top of Service Fabric in F# for just about any code you might want to write, and how we can easily scale it to large volume.
This post is the first part of a brief overview of Service Fabric and how we can model Service Fabric Actors in F#. Part 1 will cover the details of how to get up and running in SF, whilst Part 2 will look at the challenges and solutions to modelling stateful actors in an OO-based framework within F#.
What is Service Fabric?
Service Fabric is a new service on Azure (currently in preview at the time of writing) which is designed to support reliable, scalable (at “hyper scale”) and maintainable distributed applications and services – with automatic support for things like replication of state across nodes, automatic failover & recovery and multi tenanting services on the same instances. It supports (currently) both stateful and stateless micro-services and actor model architectures (more on this shortly). The good thing about Service Fabric (SF) from a risk/reward point of view is that it’s not a new technology – it actually underpins a lot of existing Azure services themselves such as Azure SQL, DocDB and even Cortana, so when Microsoft says it’s a reliable and scalable technology, they’ve been using it for a while now with a lot of big services on Azure. The other nice thing is that whilst it’s still private preview for running in Azure, you can get access to a locally running SF here. This isn’t an emulator like with Azure Storage – it’s apparently the “full” SF, just running locally. Nice.
Actors on Service Fabric
As mentioned, SF supports an Actor model in both stateful and stateless modes. It’s actually based on the Orleans codebase, although I was pleasantly surprised to see that there’s actually no C# code-generation whatsoever in SF – the only bits that are auto-generated are some XML configuration files which I suspect will be pretty much boilerplate for most people and rarely change.
Why would you want to try SF out? Well, simply put, it allows you to focus on the code you write, as opposed to the infrastructure side of things. You spin up an SF cluster (or run the local version), deploy your code to it, and off you go. This is right up my alley, as someone who likes to focus on creating solutions and sometimes has little patience for messing around with infrastructural challenges or difficulties that prevent me from doing what I’m best at.
Getting up and running with Service Fabric
I’ve been using Service Fabric for a little while now, and spent a couple of hours getting it up and running in F#. As it turns out, it’s not too much hassle to do aside from a few oddities, which I’ll outline here: –
- Download and install VS2015. Community edition should be fine here. You’ll also need Windows 8 or above.
- Download and install the SDK.
- Create a new Service Fabric solution and a Stateful Actor service. This will give you four projects: –
- A SF hosting project. This has no code in it, but essentially just the manifest for what services get deployed and how to host them.
- An Actors project. This holds your actor classes and any associated code; it also serves as a bootstrapper that deploys the appropriate services into SF; as such, it’s actually an executable program which does this during Main(). It also holds a couple of XML configuration files that describe the name of the package and each of the services that will be hosted.
- An Interfaces project. This holds your actor interfaces. I suspect that this project could just as easily be collapsed into the actors one, although I suppose for binary compatibility you might want to keep the two separate so you can update the implementations without redeploying the interfaces to clients.
- A console test project. This just illustrates how to connect to the Service Fabric and create actors. In the F# world these projects serve zero purpose since we can just create a script file to interact with our code, so I deleted this immediately.
- Convert to Paket (optional). If you use Paket rather than Nuget for dependency management, change over now. The convert-from-nuget works first time; you’ll end up with a simplified packages file of just a single dependency (Microsoft.ServiceFabric.Actors), plus you’ll get all the other benefits over Nuget.
- Create F# project equivalents. The two core projects, the Actors and Interfaces projects, can simply be recreated as an F# Console App and Class Library respectively. The only trick is to copy across the PackageRoot configuration folder from the C# Actors project to the equivalent F# one. Once you’ve done this, you can essentially disregard the C# projects.
- Configure the F# projects. I set both projects to 4.5.1 (as this is what the C# ones default to) – I briefly tried (and failed) to get them up and running in 4.5.2 or 4.6. Also, make sure that both projects target x64 rather than AnyCPU. This is more than just changing the target in the project settings – you must create a Configuration (via Configuration Manager) called x64!
- Create an interface. This is pretty simple – each actor is represented by an interface that inherits from IActor (a marker interface). Make sure that every argument in every method has an explicit name! If you don’t do this, your actors will crash on initialisation.
- Create the implementation. Here’s an example of a Cat actor interface and implementation.
- Update the Hosting project. Reference the implementation from the Hosting project and update the configuration appropriately.
Luckily, I’ve done all of this in a sample project available here.
Running your project
Once you’ve done all this, you can simply hit F5 (or Publish from the Host project) and watch as your code is launched into the Fabric via the UI.
I’m looking forward to talking more about the coding side of this in my next post, where we can see how code that is inherently mutable doesn’t always fit idiomatically into F#, and how we can take advantage of F#’s ability to mix and match OO and FP styles to improve readability and understanding of our code without too much effort.
A short interlude from my little “solving games in F#” series today.
I’ve recently moved house and was trying to figure out how long it would take to get to work. I started thinking about this problem in terms of minutes until I realised that I really wanted to calculate it in a much more important measure of time – Dream Theater songs. This is a new unit of time I have recently devised. Each unit represents the average time to listen to a single Dream Theater song. So a journey to the shops might be 2 Dream Theaters, whilst going to work might be a few more of them (if you’ve ever listened to Dream Theater, you’ll know that a typical piece might last anywhere from 3 or 4 minutes up to some 25-minute epics).
I started by defining some common units augmenting the inbuilt units supplied with F#, along with some simple conversions: –
I didn’t want to just guess what a DT really is, so of course it’s F# to the rescue. First thing, let’s calculate the average duration of a single Dream Theater song. Where do we get that data? Wikipedia, of course. A short while later, and we have the DT Wikipedia page saved locally, which contains a nice HTML table with all songs, and their lengths. After a tiny bit of cleaning up the HTML to remove some extraneous elements, we can now use the HTML type provider in FSharp.Data to do something like this: –
So, now that we’ve determined that the average length of a Dream Theater song is 8.44 minutes, let’s calculate my journey to work in DTs:
On average I can listen to just over 5 Dream Theater epics on my way into work. Or in other words, my door-to-door journey from home to work takes around 5.33 DTs :)
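The arithmetic can be sketched with units of measure as follows. The `minute` and `DT` units and the function names are invented here, and the 45-minute journey is an assumption back-calculated from the 5.33 figure (5.33 × 8.44 ≈ 45); only the 8.44 average comes from the post.

```fsharp
open Microsoft.FSharp.Data.UnitSystems.SI.UnitNames // supplies the built-in `second` unit

// Custom units augmenting the built-in ones.
[<Measure>] type minute
[<Measure>] type DT

// Simple conversions expressed as unit-carrying constants.
let secondsPerMinute = 60.0<second/minute>
let minutesPerDT = 8.44<minute/DT>

let toMinutes (duration: float<second>) : float<minute> = duration / secondsPerMinute
let toDreamTheaters (duration: float<minute>) : float<DT> = duration / minutesPerDT

// Assumed door-to-door journey time.
let journey = toMinutes 2700.0<second>      // 45 minutes
let journeyInDTs = toDreamTheaters journey  // roughly 5.33 DTs
```

The compiler checks the units, so dividing minutes by `minute/DT` can only ever produce a value in DTs.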
Continuing on with my gaming-in-F# post, this week’s post is derived from this challenge. The initial challenge is, for each node, to determine the “efficiency” of a node within a network, that is, to calculate the maximum number of hops it takes to reach all edges of the graph starting from that node. So given the graph below, starting from node 3, it’s 2 hops (in this case, to all edges i.e. 1, 5, 6 and 8). However, if you were to start at, say, node 2, it’d be 3 hops (1 hop to node 1, but 3 hops to nodes 5, 6 and 8). Therefore, node 3 is considered a more “efficient” node. Secondly, you should determine the most efficient node in the whole network – the objective being to calculate the most effective starting spot in a graph from which to reach all other nodes in the entire network.
Disclaimer: I should mention that, whilst I’ve been writing code for a number of years, the degree I took was not heavily computer-science related – so things like big O notation, complexity theory etc. – all these are things I’ve learned over the years but never formally studied. So some of the things below may seem second nature to you if you come from more of a maths background!
I identified three main elements in solving this challenge.
Representing the data
Relations are provided as a simple int * int tuple i.e. (1,3) means that there’s a (bi-directional) connection between nodes 1 and 3. So I build a simple Map<int, int []>, which is essentially a lookup that says “for a given node, give me the list of connected nodes”. Note that I decided not to use a “proper” representation of the graph here – an idiomatic way might have been a discriminated union with Node and Leaf etc. – a simple Map proved enough for me.
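Building such a lookup might look like the sketch below. The edge list is an assumption: it is one set of connections consistent with the graph described in the text (leaves 1, 5, 6 and 8; node 3 adjacent to 2, 4 and 7), not data taken from the challenge.

```fsharp
// Assumed edge list, consistent with the graph described in the post.
let relations = [ (1, 2); (2, 3); (3, 4); (3, 7); (4, 5); (4, 6); (7, 8) ]

// Each connection is bi-directional, so emit it both ways before grouping by source node.
let buildLookup (relations: (int * int) list) : Map<int, int []> =
    relations
    |> List.collect (fun (a, b) -> [ (a, b); (b, a) ])
    |> List.groupBy fst
    |> List.map (fun (node, pairs) -> node, pairs |> List.map snd |> List.toArray)
    |> Map.ofList

let lookup = buildLookup relations
// lookup.[3] is [| 2; 4; 7 |], and lookup.[1] is [| 2 |]
```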
Implementing the algorithm
Firstly, implement the logic to calculate the maximum distance to all edges, for every starting point in the graph. I solved this with (allegedly) what is essentially a recursive depth-first search. In other words, navigate from the starting point as far outwards as you can; then, walk back in until you find another branch, and go out there. Once you start walking back in, start counting how many steps it is until you reach a branch point. Once you have calculated the distance of all branches, take the largest one. Repeat this until you have exhausted all points in the graph and walked back to the starting point.
It should now be a simple task to apply this algorithm to all nodes in the graph, and then take the smallest one – this is the most efficient point in the graph from where to start.
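A compact sketch of that depth-first walk, assuming the graph is a tree (no cycles) and reusing an edge list that matches the graph described earlier. The function names are invented, and this is not the author's original code.

```fsharp
// Assumed edge list, consistent with the graph described in the post.
let relations = [ (1, 2); (2, 3); (3, 4); (3, 7); (4, 5); (4, 6); (7, 8) ]

let lookup : Map<int, int []> =
    relations
    |> List.collect (fun (a, b) -> [ (a, b); (b, a) ])
    |> List.groupBy fst
    |> List.map (fun (node, pairs) -> node, pairs |> List.map snd |> List.toArray)
    |> Map.ofList

// Longest path outwards from `node`, never stepping back to where we came from.
// The count is built up on the way back in: an edge scores 0, each step inwards adds 1.
let rec maxDepth (lookup: Map<int, int []>) (parent: int option) (node: int) : int =
    let onward = lookup.[node] |> Array.filter (fun next -> Some next <> parent)
    if Array.isEmpty onward then 0
    else 1 + (onward |> Array.map (maxDepth lookup (Some node)) |> Array.max)

let efficiency (lookup: Map<int, int []>) (node: int) : int = maxDepth lookup None node

// Apply to every node and keep the one with the smallest score.
let mostEfficient (lookup: Map<int, int []>) : int =
    lookup |> Map.toSeq |> Seq.map fst |> Seq.minBy (efficiency lookup)

// efficiency lookup 3 = 2, efficiency lookup 2 = 3, mostEfficient lookup = 3
```

Every starting node repeats the full walk here, which is exactly the cost that becomes a problem on large inputs.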
Note that this wasn’t my first solution! The biggest challenge came when one of the exercises in the website provided a large set of connections in the graph – say, 30,000. At this point, my algorithm simply didn’t perform, so I had to look at some ways to improve performance. I tried a load of different things, each of which yielded some performance improvement, but not enough: –
- Moving from Sequences to Arrays. Sequences are flexible and sometimes very useful, but Arrays generally will outperform them for maps and filters etc., particularly if you are repeating an operation over the same sequence many times (although there is Seq.cache).
- Added state tracking. For each starting point, I would record the efficiency, and then provide that number to the next starting point. Whenever I encountered a graph that had a size at least equal to the score of the “most efficient node” found so far, I would immediately stop looking at that node and backtrack all the way out. This provided a good boost in performance, but not enough.
- I also experimented with other minor algorithmic improvements, such as exiting a particular route of the graph early if we identified any single route that exceeded the current best size, rather than iterating all child nodes and evaluating them together.
None of these solutions gave an optimal solution – all they did was increase the complexity of the solution in exchange for moderate performance gains. I realised that there must be another approach that I was missing that would provide the solution. Eventually I realised how I could greatly improve efficiency – because when you start from two separate nodes, there’s usually a large amount of repeated traversals across them both. Take the above graph – you can view the network efficiency of e.g. node 3 as either described above, or as (the highest efficiency of all adjacent nodes for that subset of the graph) + 1.
In the image below, we know that Node 3 has an efficiency of 2 because the efficiency of Node 4 is 1, Node 7 is 1 and Node 2 is 1. Take the maximum of these (1), add 1, and we get 2.
So, given this, why not simply cache the results for every backtrack score from each pass? We can modify our above traversal code with some memoization backed by a simple Dictionary (apologies for using a mutable Dictionary – you could probably use an immutable Map if you wanted, although performance would probably suffer a bit) and then before making any outward facing movement in the graph, we check if that movement has already been made – if so, we can just use that result. This is why the final algorithm counts inwards from the edges rather than outwards – in order to allow caching of distances.
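The caching idea can be sketched like this, again assuming a tree-shaped graph and the same assumed edge list. The cache key (a directed step from one node to the next) and all names are choices made for this sketch rather than the author's implementation.

```fsharp
open System.Collections.Generic

// Assumed edge list, consistent with the graph described in the post.
let relations = [ (1, 2); (2, 3); (3, 4); (3, 7); (4, 5); (4, 6); (7, 8) ]

let lookup : Map<int, int []> =
    relations
    |> List.collect (fun (a, b) -> [ (a, b); (b, a) ])
    |> List.groupBy fst
    |> List.map (fun (node, pairs) -> node, pairs |> List.map snd |> List.toArray)
    |> Map.ofList

// Key: a directed step (previous, node). Value: the longest onward path after taking that step.
let cache = Dictionary<int * int, int>()

let rec depthFrom (previous: int) (node: int) : int =
    match cache.TryGetValue((previous, node)) with
    | true, depth -> depth // this movement has already been made: reuse the result
    | false, _ ->
        let onward = lookup.[node] |> Array.filter (fun next -> next <> previous)
        let depth =
            if Array.isEmpty onward then 0
            else 1 + (onward |> Array.map (depthFrom node) |> Array.max)
        cache.[(previous, node)] <- depth
        depth

// Efficiency of a start node: (highest score among adjacent nodes) + 1.
let efficiency (start: int) : int =
    lookup.[start] |> Array.map (fun next -> 1 + depthFrom start next) |> Array.max

let first = efficiency 3   // 2; fills the cache with 7 entries, one per step walked
let second = efficiency 2  // 3; only one new entry is needed, the rest are cache hits
```

Counting inwards from the edges is what makes the cached values reusable: the score for a step does not depend on where the walk originally started.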
You can see from the logging messages above that although it should take 7 steps to calculate the efficiency of any given starting node, it only takes two calculations + one cache hit when calculating efficiency the second time, and two calcs + three cache hits the third time. Notice also the simplicity of the caching layer – types are (as usual) inferred by the compiler, and automatic generalization is used here too (e.g. notice that the Dictionary never has type arguments supplied) – it just works.
In terms of performance, you can observe the effect that the caching has for varying network sizes below. Notice that the graph is scaled logarithmically – for more than around 20 relationships, the cost of not caching becomes enormous as a single cache hit can save literally thousands of steps walking the graph.
I found this to be a stimulating challenge to solve. It wasn’t necessarily because of the challenge of some specific domain solving issue but rather one regarding optimising a solution in a specific manner i.e. caching. What I particularly liked was that F# allowed us to retain the overall feel of the algorithm, but add in the caching layer very easily. In addition, adding in caching allowed me to simplify the overall algorithm – I didn’t need to worry about making four or five small, specific improvements – instead, a single optimisation allowed me to combine it with a simpler algorithm, yet still get a massive performance boost.
After a (long) hiatus from posting here I’ve decided to finally start up again. This week: F# and Active Patterns.
Game Logic in F#
I came across https://www.codingame.com/ recently – a great website that essentially has a set of “game challenges”. Your task for most of these games is to write the “message pump” (any Win32 developers will know what I’m talking about here) or “loop”. You take in a set of values that represent the game state, and then return a set of commands to affect the game. Because this sits on top of standard in / out, there are many languages supported, including F# :-)
One of the things I like about these games is that many of the challenges show different aspects of domain modelling and problem solving using specific language features of F#. Active Patterns is just one such feature. It allows us to provide abstractions over arbitrary values and types, which in the context of games means that we can more easily reason about what is happening in the game.
Defining problems as a function
Let’s take one of the games as an example from the list of games: Chasm. In short, you have a bike driving on a bridge, which has a gap in it. You have to tell the bike: –
- What speed to drive at
- When to jump
- When to slow down after jumping the gap
You can think of this as a pure function, which takes in the details of the bridge (length, gap size and landing size) as well as the current state of the bike (speed, position), and returns a single command, one of whether to slow down, speed up, jump or do nothing. Here’s a simple rules engine that determines the logic we should apply: –
- If the bike is on the runway and going too slow then speed up
- If the bike is on the runway and going too fast then slow down
- If the bike is on the runway and the correct speed then do nothing
- If the bike is in flight then do nothing
- If the bike is just before the gap then jump
- If the bike is after the gap then slow down
What I like about this is that we can break this problem down into a set of smaller problems which can then be composed together to solve the bigger problem at hand. This is classic divide-and-conquer stuff (but without any SOLID fluff getting in the way ;-)).
Active Patterns to the rescue
We can first create a couple of active patterns to gain a better understanding of the state of the bike in relationship to the bridge. Firstly, where is the bike, and secondly what speed is it going at: –
By doing this, we abstract away the implementation of “how” to reason about the speed of the bike, or where it is, into a few simple cases e.g. The bike is going too fast, the bike is approaching the gap etc. etc. In fact, having made those two patterns, we can now build our rules engine just by matching the bike’s state over both of them together: –
Notice how there’s a close affinity between the rules I wrote originally and the code above. We don’t need to think about “how” we’ve determined that we’re on the runway – it’s an artifact of the pattern match over the bike state (and the bridge state which I’ve omitted here).
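A sketch of how the two active patterns and the rules engine might fit together. The record fields, the zero-based positions and the rule that the required speed is the gap size plus one are all assumptions made for this example, not details taken from the original code.

```fsharp
type Bridge = { Runway: int; Gap: int; Landing: int }
type Bike = { Speed: int; Position: int }
type Command = SpeedUp | SlowDown | Jump | Wait

// Where is the bike? Assumes the runway occupies positions 0 .. Runway - 1.
let (|OnRunway|BeforeGap|InFlight|AfterGap|) (bridge: Bridge, bike: Bike) =
    if bike.Position = bridge.Runway - 1 then BeforeGap
    elif bike.Position < bridge.Runway - 1 then OnRunway
    elif bike.Position < bridge.Runway + bridge.Gap then InFlight
    else AfterGap

// How fast is it going? Assumes clearing the gap needs a speed of exactly Gap + 1.
let (|TooSlow|TooFast|CorrectSpeed|) (bridge: Bridge, bike: Bike) =
    let required = bridge.Gap + 1
    if bike.Speed < required then TooSlow
    elif bike.Speed > required then TooFast
    else CorrectSpeed

// The rules engine: match the same state against both patterns at once.
let decide (bridge: Bridge) (bike: Bike) : Command =
    let state = bridge, bike
    match state, state with
    | OnRunway, TooSlow -> SpeedUp
    | OnRunway, TooFast -> SlowDown
    | OnRunway, CorrectSpeed -> Wait
    | InFlight, _ -> Wait
    | BeforeGap, _ -> Jump
    | AfterGap, _ -> SlowDown

let bridge = { Runway = 10; Gap = 3; Landing = 5 }
let command = decide bridge { Speed = 2; Position = 0 } // SpeedUp
```

Each match arm corresponds to one line of the written rules, which is the affinity the post describes.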
All that’s needed is to write a simple parser for the input states into our record type and push that into a while loop that blocks until it receives the next game states, and returns the desired next command.
Dealing with a complex set of states can be difficult to reason about. Using Active Patterns is a great way to quickly and easily decompose problems into manageable chunks that can then be utilised together to build up ever higher levels of abstraction which you can easily reason about.
I had a couple of evenings free this week so decided to see if I could implement the Enigma machine, used during WW2 by the Nazis (and famously decrypted by the Polish, French and ultimately the British in Bletchley Park via some of the first programmable computers) in F#.
An overview of Enigma
The initial work around doing this involved gaining an understanding of how it worked. I’d actually already visited Bletchley Park a few years back as well as read up on the Enigma machine anyway, so had a limited understanding of how it worked. However, actually implementing it in code taught me quite a few things about it that I didn’t know!
At a high level, you can think of the Enigma as a machine that performs a number of substitution cyphers in a fixed pattern, with the substitutions changing after every letter. Although the shifts in substitution are relatively simple, I do wonder at just how individuals were able to crack these codes without detailed knowledge of the machines. Even with them, and even if you knew the rotors that were used, without the keys, there are still many permutations to consider. Apparently one of the main reasons that the Enigmas were eventually broken was down to human error e.g. many messages were initiated (or signed off) with the same common text, or some same messages were sent multiple times but with different encryption settings, thus enabling their decryption.
The Enigma we’re modelling consisted of several components, each with unique (and here, slightly simplified) behaviours: –
- Three rotors. Each rotor linked to another rotor, and acted as a substitution matrix e.g. the 1st character on one rotor connected to the 5th character on the adjacent rotor. Before each key press, the first rotor would cycle to the next position. After passing a specific letter (known as a “Knock On”), the next rotor would also cycle forward one notch.
- A reflector. This took a signal from a rotor, performed a simple substitution that was guaranteed not to substitute any letter to itself, and sent it back to the rotors. The rotors would then process the signal backwards, performing another set of three substitutions, but using the reverse cypher.
- A plugboard. This acted as an additional substitution layer for mapping letters bi-directionally e.g. A <-> B, C <-> D etc.
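To make the three components a little more concrete, here is a minimal sketch of each one as a plain function. It ignores rotor positions and ring settings entirely, and the function names (`substitute`, `substituteBack`, `plug`) are my own assumptions rather than anything from the final implementation. The two wiring strings are the commonly published wirings for rotor I and reflector B.

```fsharp
let alphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"

// Forward pass through a rotor (or the reflector): find the letter's index in
// the alphabet and take whatever letter the wiring has at that index.
let substitute (wiring : string) (letter : char) : char =
    wiring.[alphabet.IndexOf(letter)]

// Backward pass through a rotor: the same table, read in the other direction.
let substituteBack (wiring : string) (letter : char) : char =
    alphabet.[wiring.IndexOf(letter)]

// Plugboard: swap a letter with its partner if it is wired, else pass it through.
let plug (pairs : (char * char) list) (letter : char) : char =
    pairs
    |> List.tryPick (fun (a, b) ->
        if letter = a then Some b
        elif letter = b then Some a
        else None)
    |> Option.defaultValue letter

let rotorI = "EKMFLGDQVZNTOWYHXUSPAIBRCJ"
let reflectorB = "YRUHQSLDPXNGOKMIEBFZCWVJAT"
```

Note that `substituteBack rotorI` undoes `substitute rotorI`, whereas the reflector is its own inverse, which is why it needs no separate backward function.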
Here’s how a single character would flow through the machine to be encoded: –
After each “stage” in the pipeline, the character would be substituted for another one, so by the time you finish, there have been nine separate substitutions and you end up with the letter “I”. Because at least one of the rotors moves with every keypress, sending the same letter again immediately afterwards would not generate the same output.
You could also configure the Enigma in several ways: –
- There were nine rotors, each hard coded with different substitutions, and three rotor sockets on an Enigma; thus many combinations were possible depending on which rotors were inserted, and in which order.
- There were two reflectors in wide operation, one of which would be used.
- The plugboard would be used to perform an initial substitution from the letters on the keyboard.
- Each rotor could be given an arbitrary starting position (1-26).
- Each rotor could be given a specific offset (ring setting) which would also apply to the substitution.
Mapping the problem into code
So, enough about the Enigma itself – how do we map this into F#? Well, to start with, here’s a simple set of types to model our (slightly dumbed down) domain: –
And that’s all we need to model the system. Note the use of single-case Discriminated Unions to provide a way to easily wrap around primitive types that are used in multiple places e.g. RingSetting and WheelPosition. Using these not only guides the compiler, but also allows us to infer usage based solely on the type being used – we don’t need to rely on variable names.
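A domain model along these lines might look something like the sketch below. Only the names RingSetting and WheelPosition are taken from the text; every other type, field and the choice of `char` as the wrapped primitive are assumptions for illustration.

```fsharp
// Single-case unions: both wrap a char, but the compiler keeps them apart.
type RingSetting = RingSetting of char
type WheelPosition = WheelPosition of char
type KnockOn = KnockOn of char                   // assumed name

type Rotor =                                     // assumed shape
    { Mapping : string
      KnockOn : KnockOn
      RingSetting : RingSetting }

type Reflector = Reflector of string             // assumed name
type Plugboard = Plugboard of (char * char) list // assumed name

type Enigma =                                    // assumed shape
    { Left : Rotor * WheelPosition
      Middle : Rotor * WheelPosition
      Right : Rotor * WheelPosition
      Reflector : Reflector
      Plugboard : Plugboard }

// The wrappers are unpacked by pattern matching directly in the parameter list.
let describe (WheelPosition position) (RingSetting ring) =
    sprintf "position %c, ring %c" position ring
```

Calling `describe (RingSetting 'B') (WheelPosition 'A')` with the arguments swapped would be rejected by the compiler, even though both wrap the same primitive type.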
Composing functionality together
What’s interesting is how in F# you can get good results very quickly by simply starting with small functions and not necessarily worrying too much about the big picture. Once you have these small bits of functionality, you can compose them together to build much more powerful systems. Look at the pipeline diagram above, and then map that to this code below: –
Notice how closely the code above and the previous diagram map to one another. Indeed, all of these functions have identical signatures i.e. char -> char. This makes perfect sense when you consider the task of each stage of the pipeline – give me a char, and I’ll give you another one back out. You could even view the pipeline as a list of (char -> char) functions that you can fold with a single character to get a result out.
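The "list of functions" view can be sketched in a few lines. The stages here are simple alphabet shifts standing in for the real plugboard, rotor and reflector substitutions, and the names `shift`, `stages` and `translateChar` are assumptions for illustration.

```fsharp
// Stand-in stage: shift an upper-case letter n places (n >= -26), wrapping at Z.
let shift (n : int) (c : char) : char =
    char ((int c - int 'A' + n + 26) % 26 + int 'A')

// Every stage has the same signature, so they can all live in one list.
let stages : (char -> char) list = [ shift 1; shift 3; shift (-2) ]

// Fold the character through each stage in turn.
let translateChar (pipeline : (char -> char) list) (c : char) : char =
    pipeline |> List.fold (fun current stage -> stage current) c

// Equivalently, collapse the list into a single char -> char function.
let composed : char -> char = stages |> List.reduce (>>)
```

Both `translateChar stages 'A'` and `composed 'A'` take 'A' through 'B' and 'E' to end at 'C'.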
Having created this simple function to translate a single character, we can now compose this function into one that can translate a whole string of characters: –
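One way to thread the changing machine through a whole string is `List.mapFold`, which maps each character while passing an updated state along. The sketch below is a heavily simplified stand-in: the `Machine` record has a single wheel position as its only state, and all of the names are assumptions rather than the real implementation.

```fsharp
// Hypothetical one-wheel machine: the wheel position is the only state.
type Machine = { Position : int }

// Returns a new machine with the wheel advanced; the original is untouched.
let step (machine : Machine) : Machine =
    { machine with Position = (machine.Position + 1) % 26 }

// Stand-in for the full pipeline: shift the letter by the wheel position.
let translateChar (machine : Machine) (c : char) : char =
    char ((int c - int 'A' + machine.Position) % 26 + int 'A')

let translate (machine : Machine) (text : string) : string =
    let folder (current : Machine) (c : char) =
        let stepped = step current        // the wheel moves before encoding
        translateChar stepped c, stepped  // (encoded letter, next state)
    let encoded, _ = text |> Seq.toList |> List.mapFold folder machine
    System.String(List.toArray encoded)
```

With `{ Position = 0 }` as the starting machine, translating "AAA" gives "BCD": the same letter encodes differently each time, and the starting machine itself is never changed.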
Notice how, although we need to track the state of the Enigma (to manage wheel rotations after each translated character), we don’t mutate the actual enigma that was provided; rather, internally we create copies with the required changes on each pass. Once we’ve completed the capability for encrypting an entire string, we can easily build up into an easy-to-consume API: –
Isaac Abraham (@isaac_abraham) December 24, 2014
Please feel free to have a look at the full source code here. Things to take away: –
- Problems can often easily be solved through a bottom-up approach, naturally allowing a higher-level design and API to present itself.
- Composition and partial function application are key enablers for easily combining functions.
- A large part of the Enigma process can essentially be seen as an ordered set of pure char -> char functions.
- F# has a concise type system that allows us to express our domain in just a few lines of code.
Also check out the use of FsCheck within the set of unit tests as a way of testing specific properties of the Enigma machine with a large set of data!