There seem to be a number of posts out there on how to use SignalR with an IoC container, e.g. MS Unity. Nearly all of them take a sledgehammer approach to solve what most people generally want to do, which is to create their Hubs with an IoC container. They generally don’t want to replace all of SignalR’s internal dependencies.
The easiest way to get dependencies injected into SignalR hubs is not to replace the DefaultDependencyResolver – doing that hands you responsibility for creating not just Hubs, but essentially every component in the SignalR pipeline. Worse still, with an IoC container like Unity, which can resolve concrete types that have not been explicitly registered, it can make life much more complicated.
A simpler approach is to register an implementation of IHubActivator, as below: -
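Here’s a minimal sketch, assuming Unity as the container (the UnityHubActivator name and wiring are illustrative rather than prescriptive):

```csharp
using Microsoft.AspNet.SignalR;
using Microsoft.AspNet.SignalR.Hubs;
using Microsoft.Practices.Unity;

public class UnityHubActivator : IHubActivator
{
    private readonly IUnityContainer container;

    public UnityHubActivator(IUnityContainer container)
    {
        this.container = container;
    }

    // Called by SignalR every time it needs a Hub instance.
    public IHub Create(HubDescriptor descriptor)
    {
        return (IHub)container.Resolve(descriptor.HubType);
    }
}

public static class SignalRConfig
{
    public static void Register(IUnityContainer container)
    {
        // Replace just the activator; everything else in the
        // DefaultDependencyResolver stays exactly as it was.
        GlobalHost.DependencyResolver.Register(
            typeof(IHubActivator),
            () => new UnityHubActivator(container));
    }
}
```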
The HubActivator gets called only at the specific point that SignalR needs to create a Hub; you hand control over to your IoC container to create it, along with any of its dependencies. Much easier than the other approach, and easier to reason about.
I plan on blogging a bit more about my experiences with writing Type Providers in general, as there’s a dearth of material readily available online. At any rate, after several false starts, I now have a moderately usable version of an Azure Blob Storage type provider on GitHub!
It’s easy to consume – just create a specific type of account, passing in your account name and key, and away you go!
Using Type Providers to improve the developer experience
It’s worth observing the developer experience as you dot your way through. First you’ll get a live list of containers: -
Then, on dotting into a container, you’ll get a lazily-loaded list of files in that container: -
Finally, on dotting into a file, you’ll get some details on the last Copy of that file as well as the option to download the file: -
It’s important to note that the signature of the Download() function will change from file to file, depending on the extension of the file. If it’s a file that ends in .txt, .xml or .csv, it will return Async&lt;string&gt;; otherwise it returns Async&lt;byte[]&gt;. This is completely strongly typed – there’s no casting or dynamic typing involved, and you don’t have to worry about picking the correct overload for the file :-). This, for me, is a big value add over a static API, which cannot respond in this manner to the contents of the data that it operates over – and yet with a type provider it still retains static typing!
I think that this type provider is somewhat different to others like the excellent FSharp.Data ones, which are geared towards programmability etc. – this one is (currently) more suited to scripting and exploration of a Blob Storage account. I still need to make the provider more stable, and add some more creature comforts to it, but I’m hoping that this will make people’s lives a bit easier when they need to quickly and easily get some data out of (and in the future, into) a Blob store.
Since starting to deliver my “Power of F#” talk to user groups and companies (generally well received – I hope), and getting involved in a few Twitter debates on F#, I’ve noticed a few common themes regarding why .NET teams aren’t considering trying out F# to add to their development stack. Part of this is the usual spiel of misinformation about what F# is and is not (“it’s not a general purpose language”), but another part of it comes from a conservatism that really surprised me. That is, either: -
- There’s no / limited Resharper / CodeRush support for F#. Ergo, I won’t be able to develop “effectively” in it
- If I start learning it, no-one else in my team will know what I’m doing and so we can’t start using it
Now allow me to attempt to debunk these two statements.
Resharper (or CR) support is a non-issue for me. Personally, I use CodeRush over Resharper, but let’s be honest about what both of these are: Tools to speed up development time. Now, some of the issues that they solve aren’t as big an issue in F# as in C#. Perhaps the one I do miss the most is rename symbol, but others like Extract to Method aren’t as big a problem in F# due to the extremely lightweight syntax, type inference and the fact that it’s an expression-oriented language. So, it’d be nice to have some support in the language for refactorings, for sure – but it should absolutely not be a barrier to entry.
The “team training” issue is a more serious one to my mind, primarily because it’s about the individual’s perception of learning a new language rather than some arbitrary piece of tooling. Trust me when I say you can be productive in F# in just a few days if you’re coming from C# or VB .NET – particularly if you’ve used LINQ and have a reasonable understanding of it (and its constituent parts in C#3). Cast your mind back to when people started adopting .NET from, say, VB6. Was there no training issue then? Or from C++? Learning F# is far easier – the whole framework is the same as C#; it’s just the plumbing that orchestrates your calls to the framework that looks a little different.
There are certainly a number of features in F# which don’t really have direct equivalents in C#, and to get the most out of the language you’ll need to do a bit of learning to understand new concepts – just like, say, going from C#2 to C#3. I would say that F# is in a sense a higher-level superset of C# – you can do pretty much everything you would in C#, albeit in a different, terser syntax, plus a whole load of other bits and pieces which you can opt into as you get more comfortable with the language.
As developers, we need to remain open minded about the art of programming. It’s what allows us to use things like LINQ rather than simple foreach loops. It’s also what allows us to say “yes, using white space to indicate scope instead of curly braces isn’t necessarily a bad thing” or “yes, the keywords ‘let’ and ‘do’ aren’t inherently evil in a programming language”. Keep an open mind and try something new – you might actually be pleasantly surprised!
Just a quick post and update on my review of JustMockLite from earlier this year. I originally had a few comments on some features, which I’m pleased to say have now been rectified.
Support (or lack thereof) for recursive mocks was one of my main criticisms of earlier versions of JML. For example, if you had a mock with a method that needed to return another mock – or worse still, you needed to mock the result of a method on that child mock – it was a bit of a pain: you had to manually construct the child mock, and then arrange the top-level call to return it, and so on.
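Here’s a sketch of the sort of arrangement that’s now possible (the interfaces here are purely illustrative):

```csharp
using System;
using Telerik.JustMock;

public interface IAddress { string City { get; } }
public interface IUser { IAddress Address { get; } }
public interface IUserRepository { IUser GetUser(string name); }

class RecursiveMockDemo
{
    static void Main()
    {
        var repository = Mock.Create<IUserRepository>();

        // Arrange a nested call in a single chained expression; the
        // IUser and IAddress child mocks are created automatically.
        Mock.Arrange(() => repository.GetUser("isaac").Address.City)
            .Returns("London");

        Console.WriteLine(repository.GetUser("isaac").Address.City); // London
    }
}
```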
This simple code sample illustrates how recursive mocks are now extremely simple to do in JML. Child mocks are now automatically created without the need to explicitly create one, and you can chain a method call expression when arranging the result of a nested mock. Very nice.
This is a small but important feature for getting up to speed more quickly – JML now includes comments on its methods and types, which should aid in getting up and running without having to resort to the documentation.
This is all really good. I’d still love to see ignoring of arguments by default on method call arrangement, but overall JML continues to improve – definitely recommended.
A bit late in the day, but here’s a quick and simple demonstration of the fundamental difference between single-threaded, multi-threaded and asynchronous coding patterns. To illustrate this, here’s a simple task that you’ve no doubt seen a million times before – download a number of resources over HTTP. The code for this demo is available in both C# and F# forms (side note: I know that the code samples are not identical in their approach – the C# one uses Parallel.ForEach and has manual (pre-C#5) async, whilst the F# one uses map and iter, doesn’t do console colouration and uses “proper” async – but the F# sample is about 50% of the size of the C# one…).
The simplest way to do this is with a foreach loop (or a simple Select over a collection of URIs), using the DownloadString() method on WebClient.
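A minimal sketch of the serial version (the URLs here are illustrative): -

```csharp
using System;
using System.Net;
using System.Threading;

class SerialDownloader
{
    static void Main()
    {
        var urls = new[]
        {
            "http://www.tottenhamhotspur.com",
            "http://cockneycoder.wordpress.com"
        };

        using (var client = new WebClient())
        {
            foreach (var url in urls)
            {
                // DownloadString blocks the calling thread until the
                // whole resource has been retrieved.
                var content = client.DownloadString(url);
                Console.WriteLine("{0}: {1:N0} chars on thread {2}",
                    url, content.Length, Thread.CurrentThread.ManagedThreadId);
            }
        }
    }
}
```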
As you can see, downloading each URL is performed in series, on the same thread.
We can perform the same work in parallel, either with Parallel.ForEach (useful when you have no outputs, just side-effects), or with PLINQ’s AsParallel() for a more functional approach. Again, though, we use DownloadString() on WebClient to retrieve the data. There is no asynchrony utilised, just multi-threading.
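A sketch of the parallel version (one WebClient per iteration, as WebClient isn’t thread-safe):

```csharp
using System;
using System.Net;
using System.Threading;
using System.Threading.Tasks;

class ParallelDownloader
{
    static void Main()
    {
        var urls = new[]
        {
            "http://www.tottenhamhotspur.com",
            "http://cockneycoder.wordpress.com"
        };

        // Each iteration runs on a thread-pool thread, blocking it
        // for the duration of that download.
        Parallel.ForEach(urls, url =>
        {
            using (var client = new WebClient())
            {
                var content = client.DownloadString(url);
                Console.WriteLine("{0} downloaded on thread {1}",
                    url, Thread.CurrentThread.ManagedThreadId);
            }
        });

        // Parallel.ForEach joins: we only reach here once every
        // download has completed.
    }
}
```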
Now you can see that we are downloading all resources in parallel, all on different threads. So, if one URL is performing slowly, this won’t prevent the others downloading at the same time. However, notice that there is an affinity between URL and thread ID i.e. TottenhamHotspur.com is initiated and handled by thread 1. CockneyCoder is initiated and handled by thread 4. This is because the thread is blocked whilst waiting for the resource to download. This is a crucial point – whilst your resource is downloading the thread cannot be used for anything else.
Also notice – Parallel.ForEach will automatically join for us so that when the next line after the foreach executes, all code inside the “loop” has been completed (assuming that within the foreach no background tasks were initiated!).
With the asynchronous approach, instead of explicitly spinning up Tasks (which generally map onto threads), we use the DownloadStringTaskAsync() method on WebClient. This starts the download of the resource and immediately returns a Task, which we can observe to see whether it has completed or not. Here are the results of this approach: -
So, you can see that we still spawn all the downloads in parallel. However, notice that there is no longer this “stickiness” between thread and URL. TottenhamHotspur.com started downloading on one thread, but returned on another. This is because once the download has been initiated, the thread is freed up for other processing. This is a subtle but fundamental difference from the approach above, where you are locking a thread for the entirety of the download; the asynchronous approach is far more scalable.
If you look at the actual code though, you will see that we have introduced a small complexity in that we have now added a callback event handler; this is because the callback happens on some other thread at some point in the future. In C#5, we can also await the task, so that we can write imperative-style code and not have to mess around with callbacks etc., but in the sample I’ve purposely not introduced async / await for the sake of simplicity – what I want to illustrate is the distinction of thread management when writing asynchronous code.
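Roughly what that looks like in sketch form – pre-async/await, with an explicit continuation:

```csharp
using System;
using System.Collections.Generic;
using System.Net;
using System.Threading;
using System.Threading.Tasks;

class AsyncDownloader
{
    static void Main()
    {
        var urls = new[]
        {
            "http://www.tottenhamhotspur.com",
            "http://cockneycoder.wordpress.com"
        };

        var downloads = new List<Task>();
        foreach (var url in urls)
        {
            var client = new WebClient();

            // DownloadStringTaskAsync returns immediately; no thread is
            // blocked while the download is in flight. The continuation
            // may run on a different thread to the one that started it.
            var task = client.DownloadStringTaskAsync(url)
                .ContinueWith(t =>
                    Console.WriteLine("{0} completed on thread {1}",
                        url, Thread.CurrentThread.ManagedThreadId));

            downloads.Add(task);
        }

        Task.WaitAll(downloads.ToArray()); // join before exiting
    }
}
```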
Parallelism and asynchrony are often talked about interchangeably. This is unfortunate, because the two are somewhat orthogonal; you can write code that is parallel but not asynchronous. Alternatively, you can write code that is sequential but also asynchronous. Be aware of this important distinction, and decide when writing your code whether you need one or both of them – and what tools you have at the language level: in C#, you have async/await for making async code more readable; F# has async workflows. When operating over collections, you can also use PLINQ’s AsParallel() method, as well as Parallel.ForEach(), although you will of course need to be cautious of shared state with the latter.
What is Memoization?
Just a simple demo outlining how you might write a generic callsite cache in F#, a.k.a. memoize. This is something that you can find elsewhere on the web – it’s easy to write one. What I really want to illustrate is how much more boilerplate you need to write something similar in C#. I’ve actually written a more fully-featured caching call handler for Unity which works with any number of arguments, but trust me when I say that it was a fair amount of work to do. And when you’re working with mutable data structures, it’s difficult to know what you can safely do with them with regards to e.g. putting them in hash tables (hash values on mutable objects can potentially change over time, but not so with immutable structures).
Memoize in C#
Anyway… here’s an example in C# of a higher-order function that takes in a given function and wraps calls to it in a cache, along with some sample calling code to show how we consume it.
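Here’s a minimal sketch of what that might look like (the Memoizer name is illustrative):

```csharp
using System;
using System.Collections.Generic;

public static class Memoizer
{
    // Wraps any single-argument function in a dictionary-backed cache.
    public static Func<TArgs, TResult> Memoize<TArgs, TResult>(
        Func<TArgs, TResult> func)
    {
        var cache = new Dictionary<TArgs, TResult>();
        return args =>
        {
            TResult result;
            if (cache.TryGetValue(args, out result))
                return result;           // cache hit

            result = func(args);         // cache miss: call the real code...
            cache.Add(args, result);     // ...and remember the answer
            return result;
        };
    }
}

class Program
{
    static void Main()
    {
        // Generic arguments must be supplied explicitly, and the two
        // arguments have to be packed into a single Tuple.
        var add = Memoizer.Memoize<Tuple<int, int>, int>(args =>
        {
            Console.WriteLine("Adding {0} and {1}...", args.Item1, args.Item2);
            return args.Item1 + args.Item2;
        });

        Console.WriteLine(add(Tuple.Create(5, 10))); // runs the lambda
        Console.WriteLine(add(Tuple.Create(5, 10))); // served from the cache
    }
}
```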
Ouch! This isn’t the most readable code in the world (although I tried my best :-)). On the client, we need to explicitly supply the generic type arguments so that the Memoize function knows what type of cache to create. I had hoped that the compiler could infer this based on the signature of the method being supplied, but sadly not. Also, because the cache we’re using works on just a single argument, we have to supply all values as a single Tuple, so rather than just calling Add(5, 10), we have to call Add(Tuple.Create(5, 10)). This isn’t great. We could try to change the way the cache works to take in multiple arguments, but there would be limitations and it wouldn’t be truly generic.
The implementation of the cache isn’t that much better. We’re lost in a sea of TArgs and TResults for the method signature, and also have one of the dreaded out parameters for our TryGetValue from the cache. Otherwise it’s fairly bland – see if the value is in the cache; if it is, return it, otherwise call the supplied “real” code, and cache that result before passing it back out. Pretty basic chain of responsibility.
Memoize in F#
Here’s nearly the same code but in good old F#!
So this code boils down to pretty much the same as the C# sample, except that all the fluff is gone. From the caller’s point of view, we declare our add function, and then simply call memoize and pass it in. No need for generic arguments, as the F# compiler is smart enough to infer them. We also create a Tuple in F# with syntax that appears to be more like C# method arguments i.e. (5,10). This is succinct, lightweight and easily readable.
For the implementation of the cache, it’s also much cleaner. Firstly, there are no generic type arguments. In fact, there are no explicit types at all except for the Dictionary. F# also handles “out” parameters much more elegantly than C#, so TryGetValue can be easily called and consumed as a single expression.
This was just a fairly short and simple code sample illustrating how type annotations can sometimes quickly get out of control. Having automatic generalisation of functions lets us concentrate on the core functionality of what we’re trying to achieve – the F# version is around 50% of the size of the C# one, but does the same thing. There’s also an example of how nicely F# interoperates with .NET’s out parameters by returning them as part of a Tuple.
Entity Framework 6 is coming!
Entity Framework 6 is now ramping up for a release. It brings nice async functionality, gives lots more power to the Code First capability, and takes EF completely out of the core .NET Framework – it’s now a standalone NuGet package. So, the story for creating and modifying your SQL database with EF now boils down to these four steps (there’s a minimal sketch after the list): -
- Get hold of EF via a NuGet package
- Create a physical data model based on an object model
- Update and evolve your database through code-first migrations
- Create a database quickly and easily using LocalDB
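By way of illustration, a minimal code-first sketch – assuming the EntityFramework NuGet package and LocalDB, with a purely illustrative model:

```csharp
using System;
using System.Data.Entity;
using System.Linq;

public class Customer
{
    public int Id { get; set; }
    public string Name { get; set; }
}

// By convention, EF6 creates the database on first use - no running
// SQL Server service is required when targeting LocalDB.
public class ShopContext : DbContext
{
    public DbSet<Customer> Customers { get; set; }
}

class Program
{
    static void Main()
    {
        using (var db = new ShopContext())
        {
            db.Customers.Add(new Customer { Name = "Isaac" });
            db.SaveChanges();

            // LINQ queries are translated into SQL for us.
            var names = db.Customers.Select(c => c.Name).ToList();
            names.ForEach(Console.WriteLine);
        }
    }
}

// Schema evolution is then handled from the Package Manager Console:
//   Enable-Migrations
//   Add-Migration SomeChange
//   Update-Database
```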
No dependency on .NET 4.5, or on a specific version of Visual Studio, or on having a running instance of SQL Server (even SQL Express) as a Windows Service. This all sounds great. We can now create a physical database, create migrations from one version to the next, and write queries in the language that are automatically translated into SQL. In effect, EF6 with code-first allows you to write an application without ever having to look at the database. This may be a somewhat short-sighted or simplistic view – you’ll always need to optimise your DB with indexes and perhaps stored procedures for high-performance corner cases – but for quickly getting up and running, and I would suggest for a fair proportion of your SQL database requirements, EF6 as a framework will sort you out.
This is all great – the barrier to entry for getting up and running in a short space of time is extremely low. However, I would then raise the question: if you’re using your database in a fashion akin to that above – where it’s just a means to an end – why use a SQL database at all? What are you actually using SQL for? As a way to get data into and out of your application. Your queries are written outside of SQL, in e.g. LINQ. Your tables are generated outside of T-SQL. You never interact with tables, because EF is acting as an ORM that does all the funky mapping for you. So what exactly are you getting from using SQL?
The point is that you can remove almost all of the impedance mismatch between your object model and your data store far more simply by using something like MongoDB. You don’t have a formal schema as such in the database – you simply read and write object graphs to and from a document. The concept of a “mapping” between DB and object model is almost entirely removed. There’s no need to formally upgrade a “schema” – foreign keys, indexes, columns and so on – there’s your object model, and nothing else.
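To make that concrete, here’s a rough sketch using the MongoDB .NET driver (a 2.x-style API; the database, collection and model are all illustrative):

```csharp
using System;
using System.Threading.Tasks;
using MongoDB.Driver;

public class Customer
{
    public int Id { get; set; }
    public string Name { get; set; }
    public string[] Tags { get; set; } // nested data, no join table needed
}

class Program
{
    static void Main()
    {
        MainAsync().Wait();
    }

    static async Task MainAsync()
    {
        var client = new MongoClient("mongodb://localhost");
        var customers = client.GetDatabase("shop")
                              .GetCollection<Customer>("customers");

        // The whole object graph is written as one document - no schema,
        // no migration, no mapping layer.
        await customers.InsertOneAsync(new Customer
        {
            Id = 1,
            Name = "Isaac",
            Tags = new[] { "fsharp", "azure" }
        });

        var found = await customers.Find(c => c.Name == "Isaac").FirstAsync();
        Console.WriteLine(found.Tags.Length);
    }
}
```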
I’m not suggesting you go and dump SQL tomorrow if you’ve already invested in it. What I would suggest is that the next time you create a new database you consider why you use SQL and whether you could achieve the same use cases by using a schema-free document database like MongoDB or RavenDB. What are you really using your database for today?
Java and C#
I want to start by first doing a quick comparison between languages on the JVM and .NET. The main differentiators that I see between the JVM and .NET are twofold: -
The JVM is much more common on non-Windows servers, e.g. LAMP stacks. Yes, there’s Mono, which is becoming more popular, but generally this domain is taken by Java. Between the two most popular languages across the two platforms – Java and C# – Java has slowly stagnated over the years, apparently due to over-deliberation by committee on the details of various features, whilst C# has gone steaming ahead, introducing genuinely revolutionary features to OO developers such as queries and asynchronous programming as first-class citizens, as well as functional programming constructs.
Whilst Java (and indeed the JVM) has a fundamentally limited implementation of generics which will likely never change, C# and .NET have generics baked throughout the runtime. Whilst Java 8 will (in 2014) finally get lambda expressions (something which I remember Java developers passionately castigating in 2008 when C#3 first came out), again, because it doesn’t have a notion of a delegate, it won’t be as elegant as the C# implementation, involving (as I understand it) compiler tricks to turn lambdas into single method interfaces.
One thing both Java and C# do have in common is that they are rapidly becoming languages that will be prevalent on the server-side, notwithstanding e.g. internal LOB applications or native mobile applications (even that’s a bit of a hot potato with things like Xamarin – maybe best left for a discussion on another day).
The good, the bad and the ugly of C#
So C# has given us developers some pretty good things. Yet in a sense, since C#3 came out, the language hasn’t moved forward that much. Dynamic was introduced, we got better co- and contra-variance, and C#5 introduced async/await. All excellent features (particularly the last one), but where do we go from here? I’ll go back to my comment about lambda expressions in Java here. Some of the main criticisms of LINQ from the Java community when it first came out were driven by emotion rather than reason – that LINQ was just a poor man’s SQL embedded into the language; that C# was now a dynamic language; that extension methods would somehow break encapsulation; that LINQ was just Microsoft’s latest attempt to win new customers and it “wouldn’t work”, and so on. All nonsense, yet one underlying message from the complaints was that lambdas and LINQ would be too confusing for developers. To an extent I think that this is more of an issue in the Java community, where I see the language being used in a far more OO way than C# is nowadays.
But, at the same time, think about a new developer learning C#. There are already many ways to achieve the same thing at both the language and framework level e.g. set based operations, asynchronous operations, multi-threading etc. etc.. Think how many constructs and fundamentals that there are to learn these days; you’d probably get contradictory information and advice from ten experienced .NET developers that have picked up different bits over the years. At least we’ve been drip-fed C# over the years at a rate we can absorb whereas a new developer will have to try to distil that information to a manageable amount. It’s harder and harder to know the “right” way of doing things.
There are some things in the language that we’ll probably never get – things like non-nullable reference types, which would eliminate whole classes of bugs, or the removal of leftovers from .NET 1 that were replaced in .NET 2 like non-generic collections etc., or the multiple-ways-to-do-async etc. etc.
Where does C# go from here? Do the designers continue to add features to it, e.g. immutability, more succinct type declarations (something I’ve been hearing might well happen) or improved type inference (something I would love to see)? As each revision of C# goes in, it becomes harder and harder to add new features to it. On the other hand, languages like Scala on the JVM and F# on .NET are changing at a much more rapid pace. Ironically, Scala seems to me more like C# than F# does, with its objects-first-but-functional-programming-supported approach, but both it and F# are introducing new features to their respective communities.
Is there a case to perhaps keep the changes in C# at a relatively sedate pace now, before it becomes even more complex, and let e.g. F# push the boundaries with new features? What features do we need to see added to C# to improve the programming experience without making the language a mish-mash? Added to this is Roslyn – what impact will that have on the language going forward?
I suppose the main upshot is that I think the days of mega-changes in C#, like we saw in C#3, are over. What I suspect we’ll see in the future are most likely carefully-thought-out smaller features which can improve the experience or perhaps deal with specific problem domains but won’t necessarily change the fundamentals of C# as we saw with LINQ.
Had an interesting discussion on Twitter regarding storing relational data and how to essentially pre-render joins across data to improve performance at the cost of storage space. To my knowledge, this isn’t really possible, and it goes against the ethos of relational databases anyway, i.e. avoid duplication of data, one record for any bit of information, and so on. Anyway, the point being made was that storage space is dirt cheap these days, whereas CPU time isn’t.
A lot of people don’t really think of “data” and “compute” as two sides of the same coin, but they really are. Think of the example above – joining two (or more) tables together in SQL is always going to be slower than going to a single table where the data is already pre-calculated. With document databases this sort of approach is more common, where you might have many ways of representing the same data, and indeed the same pieces of data duplicated across many documents, but it’s often frowned upon in the SQL world.
CPU and Data costs in Azure
So, to illustrate the disparity between data and CPU, here are some costs of hosting data and compute in Azure (at the time of writing) if your budget is £100 / month: -
| Category | Option | £100 / month buys |
| --- | --- | --- |
| Relational | SQL Web / Business | 1 x 83GB database |
| Relational | SQL Web / Business | 2 x 26GB databases |
| Relational | SQL Web / Business | 3 x 13GB databases |
| CPU | Extra Small instance (1GHz, 768MB) | 10 instances |
| CPU | Small instance (1.6GHz, 1.75GB) | 2 instances |
| CPU | Medium instance (2 x 1.6GHz, 3.5GB) | 1 instance |
Look at those numbers.
You can have a single database up to 83GB in size for £100, OR you can have 2.3TB of Blob / Table / Queue storage. That’s an absolutely ridiculous amount in comparison – over 20x as much. Of course, with Table storage you don’t get a schema as such, and you have to worry about partitioning data etc. – it’s nowhere near as easy to get up and running as a SQL database that we all know (and some of us love). Nor can you do any compute on the table store as you can with SQL, e.g. stored procedures, joins, group by and so on. But for pure data storage, there’s no comparison. Alternatively, compare our 2TB+ of schema-free data with some CPU prices – two small instance VMs (or web / worker roles) will go for the same amount, or just a single medium instance. I can’t even include SQL Premium or anything above a medium instance VM in the above table, as they clock in at over £100 a month.
Future compute can be traded for upfront data – the latter is relatively cheap compared to the former.
You can think of this in pure functional terms, e.g. a function Factorial which takes in a number and returns its factorial – you could either calculate that on demand, or calculate it once and then store the results in a table. The cost of hosting those results is tiny compared to the cost of repeatedly calculating them on demand.
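A trivial sketch of that trade (in reality you’d persist the dictionary to a table rather than keep it in memory):

```csharp
using System;
using System.Collections.Generic;
using System.Numerics;

class PrecomputedFactorials
{
    static void Main()
    {
        // Pay the compute cost once, up front...
        var factorials = new Dictionary<int, BigInteger>();
        BigInteger acc = 1;
        for (var n = 1; n <= 100; n++)
        {
            acc *= n;
            factorials[n] = acc;
        }

        // ...then every subsequent "calculation" is a cheap lookup,
        // much like reading a pre-aggregated row from table storage.
        Console.WriteLine(factorials[50]);
    }
}
```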
Now obviously, the above results never change; a more real-world example might be a system where you need to host two worker roles to handle compute demands involving calculations over some data (e.g. website queries to find products that match some filter). Why not consider pre-calculating those results where possible and storing them in an Azure Table instead? A batch process could update the aggregations as required to keep the results reasonably fresh. Even if the table takes up e.g. 50GB, this is a mere pittance compared to the cost of a single small worker role, which equates to roughly 1TB of table storage in terms of price. The same trade-off works in reverse when you compress data to “save” disk space at the cost of the CPU cycles required to decompress it every time you read it back out.
Makes you think a little bit (at least, it did for me)!
So, a couple of days ago I had a good discussion on Twitter with a developer that I respect regarding the future of JS. This was in response to this article, which basically suggests that within the next few years JS will become the de facto programming language on both the client and server. Now, I’ve read that article, and re-read it, and re-read it again. I still don’t agree with it. Let’s discuss the client and server sides as separate entities…
There are already three programming languages out there that compile down to JS – CoffeeScript (CS), TypeScript (TS) and Dart. Dart has slightly loftier goals in that it sees itself in the long term as a complete replacement for JS, with its own interpreter / compiler in the browser, but in the short term it fulfils the same purpose as the other two. CS uses a completely different syntax to JS, whilst TS is a superset of JS, meaning that the barrier to entry is extremely low. At any rate, the fact that these three languages exist tells us something – that JS struggles with large-scale applications. It has poor facilities for abstraction, module discovery and generally for reasoning about your code in a way that allows you to do simple things like rename a variable with any degree of certainty. Developers at places like Google have had to come up with elaborate naming standards for variables and classes to allow people to infer the scoping or usage of types. I thought we dropped Hungarian notation, with its szCustomerName ridiculousness, for good in the 90s, but evidently not.
So where am I going with this rant? Simply that whilst JS will almost certainly stay the most popular language that web applications run on, it won’t be the dominant programming language for developers. I certainly wouldn’t want to write the next Facebook, Gmail or whatnot in plain JS. It’s a nightmare to maintain and, without basic constructs like interfaces, classes and modules, doesn’t easily allow for organising code into manageable chunks. I’m not saying it can’t be done, but it involves lots of extra work by the development team – things that should be given to us by a modern language (cast your mind back to what JS was originally designed for – it certainly wasn’t to write Google Maps).
On the server side, things get even more bizarre. For starters, not only do you have all the same issues as above (which in today’s world are likely to become even more pronounced as big data problems get both more complex and more commonplace), but a whole host of other issues rear their head.
Firstly, server-side applications tend to be much more mission critical than a front-end application from a data point of view. A bug on your website may only result in displaying some data incorrectly – a transient issue that can be fixed with no permanent damage. Conversely, a bug on the server may end up actually calculating the wrong data and persisting it to your data store. Because of the nature of server-side applications, you might not even notice this until much later. Indeed, with JS, issues like accessing the wrong field because of a typo don’t cause a runtime error – the code just carries on happily. More so, with the relatively immature nature of JS testing frameworks, these sorts of errors might be even more prevalent. And I don’t want to generalise too much, but how many front-end JS developers would be confident writing a server-side application in a test-first manner?
What about features that you would expect to see in a language that was going to be used to write a mission-critical server-side application? How about excellent multithreading support? No, wait – how about any multithreading support? What about simple-to-use yet powerful asynchronous support (like that which C#, F# and, I believe, Python now have)? The ability to know with certainty where a particular object is being used? I just don’t understand why you would want to give up all of those language features, let alone things like test frameworks – or tooling – that people have grown accustomed to.