A refresher on Async


A bit late in the day, but here’s a quick-and-simple demonstration of the fundamental difference between single-threaded, multi-threaded and asynchronous coding patterns. To illustrate this, here’s a simple task that you’ve no doubt seen a million times before – downloading a number of resources over HTTP. The code for this demo is available in both C# and F# forms (side note: I know that the code samples are not identical in their approach – the C# one uses Parallel.ForEach and has manual (pre-C#5) async whilst the F# one uses map and iter, doesn’t do console colouration and uses “proper” async – but the F# sample is about 50% of the size of the C# one…).

Sequential execution

The simplest way to do this is with a foreach loop (or a simple Select over a collection of URIs), calling the DownloadString() method on WebClient. This gives you something like the following: –

Image

As you can see, the URLs are downloaded in series, one after another, on the same thread.
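For anyone who wants to try this without the full demo, here is a minimal sketch of the sequential pattern. It is not the code from the linked sample: DownloadSequentially and FakeDownload are names I have made up for illustration, and FakeDownload is a stand-in that just sleeps, so that the sketch runs without a network connection.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Downloads each URI in turn on the calling thread and records which thread did the work.
// `download` stands in for a blocking call such as WebClient.DownloadString.
Dictionary<string, int> DownloadSequentially(IEnumerable<string> uris, Func<string, string> download)
{
    var threadByUri = new Dictionary<string, int>();
    foreach (var uri in uris)
    {
        string body = download(uri); // blocks here until this download has finished
        threadByUri[uri] = Environment.CurrentManagedThreadId;
        Console.WriteLine($"{uri}: {body.Length} chars on thread {threadByUri[uri]}");
    }
    return threadByUri;
}

// Hypothetical stand-in for a slow HTTP call, so the sketch runs offline.
string FakeDownload(string uri)
{
    Thread.Sleep(100);
    return "content of " + uri;
}

var sequentialResult = DownloadSequentially(
    new[] { "http://example.com/a", "http://example.com/b", "http://example.com/c" },
    FakeDownload);
```

To run it against real sites, pass `uri => new WebClient().DownloadString(uri)` (from System.Net) as the second argument instead of FakeDownload.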

Parallel execution

We can perform the same work in parallel, either with Parallel.ForEach (useful when you have no outputs, just side-effects), or with PLINQ’s AsParallel() for a more functional approach. Again, though, we use DownloadString() on WebClient to retrieve the data. There is no asynchrony here, just multi-threading.
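As a rough sketch of the more functional flavour, the query below uses AsParallel() with a blocking download. FakeDownload and the example.com URIs are placeholders of mine rather than anything from the demo, and AsOrdered() is only there so that the results come back in input order.

```csharp
using System;
using System.Linq;
using System.Threading;

// Hypothetical stand-in for a blocking call such as WebClient.DownloadString,
// so the sketch runs without a network connection.
string FakeDownload(string uri)
{
    Thread.Sleep(100);
    return "content of " + uri;
}

var uris = new[] { "http://example.com/a", "http://example.com/b", "http://example.com/c" };

// AsParallel() lets PLINQ spread the blocking calls across worker threads;
// AsOrdered() keeps the results in the same order as the input.
var pages = uris
    .AsParallel()
    .AsOrdered()
    .Select(uri => new
    {
        Uri = uri,
        Body = FakeDownload(uri),
        ThreadId = Environment.CurrentManagedThreadId
    })
    .ToArray();

foreach (var page in pages)
    Console.WriteLine($"{page.Uri}: {page.Body.Length} chars, downloaded on thread {page.ThreadId}");
```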

Image

Now you can see that we are downloading all resources in parallel, all on different threads. So, if one URL is performing slowly, this won’t prevent the others downloading at the same time. However, notice that there is an affinity between URL and thread ID, i.e. TottenhamHotspur.com is initiated and handled by thread 1, and CockneyCoder is initiated and handled by thread 4. This is because the thread is blocked whilst waiting for the resource to download. This is a crucial point – whilst your resource is downloading, the thread cannot be used for anything else.
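The affinity between URL and thread can be sketched like this. Again, this is an illustration under my own assumptions rather than the demo code: DownloadInParallel and FakeDownload are made-up names, and the fake download simply sleeps in place of a blocking WebClient.DownloadString call.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Runs the blocking downloads in parallel and records, per URI, the thread that
// started the download and the thread that handled the result.
ConcurrentDictionary<string, (int Started, int Finished)> DownloadInParallel(
    IEnumerable<string> uris, Func<string, string> download)
{
    var threads = new ConcurrentDictionary<string, (int Started, int Finished)>();
    Parallel.ForEach(uris, uri =>
    {
        int started = Environment.CurrentManagedThreadId;
        download(uri); // this worker thread is blocked for the whole download
        int finished = Environment.CurrentManagedThreadId;
        threads[uri] = (started, finished);
    });
    return threads;
}

// Hypothetical stand-in for a slow, blocking HTTP call.
string FakeDownload(string uri)
{
    Thread.Sleep(100);
    return "content of " + uri;
}

var affinity = DownloadInParallel(
    new[] { "http://example.com/a", "http://example.com/b", "http://example.com/c" },
    FakeDownload);

foreach (var entry in affinity)
    Console.WriteLine($"{entry.Key}: started on thread {entry.Value.Started}, finished on thread {entry.Value.Finished}");
```

Whichever worker threads are picked, the started and finished IDs for a given URI always match, because the thread never leaves the loop body while the download is in flight.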

Also notice – Parallel.ForEach will automatically join for us so that when the next line after the foreach executes, all code inside the “loop” has been completed (assuming that within the foreach no background tasks were initiated!).
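The caveat in brackets is worth a quick sketch. In the hypothetical snippet below (not taken from the demo), the loop bodies are all finished by the time Parallel.ForEach returns, but the background tasks they kicked off with Task.Run are not waited for and have to be joined separately.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

int loopBodiesRun = 0;
var backgroundTasks = new ConcurrentBag<Task>();

Parallel.ForEach(new[] { "a", "b", "c" }, item =>
{
    // Work done directly in the loop body is complete before Parallel.ForEach returns.
    Interlocked.Increment(ref loopBodiesRun);

    // Work handed off to a background task is NOT waited for by Parallel.ForEach.
    backgroundTasks.Add(Task.Run(() => Thread.Sleep(200)));
});

// All three loop bodies have run by this point, so this is always 3.
Console.WriteLine($"Loop bodies run: {loopBodiesRun}");

// The background tasks, however, may well still be running.
Console.WriteLine($"Background tasks finished so far: {backgroundTasks.Count(t => t.IsCompleted)} of {backgroundTasks.Count}");

// An explicit join is needed for anything started in the background.
Task.WaitAll(backgroundTasks.ToArray());
```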

Asynchronous execution

With this approach, instead of explicitly spinning up Tasks (which generally relate to Threads), we use the DownloadStringTaskAsync() method on WebClient. This starts the download of the resource and returns a Task, which we can check to see whether it has completed or not. Here are the results of this approach: –

Image

So, you can see that we still spawn all the downloads in parallel. However, notice that there is no longer this “stickiness” between thread and URL. TottenhamHotspur.com started downloading on one thread, but returned on another. This is because once the download has been initiated, the thread is freed up for other processing. This is a subtle but fundamental difference from the approach above, where you are blocking a thread for the entirety of the download; the asynchronous approach is far more scalable.
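Here is a minimal sketch of the callback style, assuming a made-up FakeDownloadAsync that uses Task.Delay in place of WebClient.DownloadStringTaskAsync so that it runs offline. Unlike the demo, it starts the downloads from a plain Select rather than in parallel, so every download is started on the calling thread; the callbacks then run later on whichever thread is available.

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical stand-in for WebClient.DownloadStringTaskAsync.
async Task<string> FakeDownloadAsync(string uri)
{
    await Task.Delay(100); // no thread is blocked while we "wait for the network"
    return "content of " + uri;
}

int callingThread = Environment.CurrentManagedThreadId;
var uris = new[] { "http://example.com/a", "http://example.com/b", "http://example.com/c" };

// Start every download and attach a callback to each returned Task.
// ToList() forces the lazy Select to run now, so all downloads are under way at once.
var pending = uris.Select(uri =>
{
    int started = Environment.CurrentManagedThreadId;
    return FakeDownloadAsync(uri).ContinueWith(finished =>
        (Uri: uri,
         Length: finished.Result.Length,
         Started: started,
         CalledBack: Environment.CurrentManagedThreadId));
}).ToList();

var results = await Task.WhenAll(pending);

foreach (var r in results)
    Console.WriteLine($"{r.Uri}: {r.Length} chars, started on thread {r.Started}, callback on thread {r.CalledBack}");
```

The callback thread IDs will vary from run to run, which is rather the point: nothing ties a download to the thread that started it.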

If you look at the actual code though, you will see that we have introduced a small complexity in that we have now added a callback event handler; this is because the callback happens on some other thread at some point in the future. In C#5, we can also await the task, so that we can write imperative-style code and not have to mess around with callbacks etc., but in the sample I’ve purposely not introduced async / await for the sake of simplicity – what I want to illustrate is the distinction of thread management when writing asynchronous code.
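For comparison, this is roughly what the await version looks like. It is a sketch of the general C#5 pattern and not something from the demo; FakeDownloadAsync and DownloadLengthAsync are hypothetical names, with Task.Delay standing in for a real DownloadStringTaskAsync call.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical stand-in for WebClient.DownloadStringTaskAsync.
async Task<string> FakeDownloadAsync(string uri)
{
    await Task.Delay(100);
    return "content of " + uri;
}

// Reads top to bottom like synchronous code, with no explicit callback.
async Task<int> DownloadLengthAsync(string uri)
{
    Console.WriteLine($"{uri}: started on thread {Environment.CurrentManagedThreadId}");

    // The method is suspended here and the thread is released until the download completes.
    string body = await FakeDownloadAsync(uri);

    // This line may well run on a different thread from the one that started the download.
    Console.WriteLine($"{uri}: resumed on thread {Environment.CurrentManagedThreadId}");
    return body.Length;
}

int length = await DownloadLengthAsync("http://example.com/a");
Console.WriteLine($"Downloaded {length} chars");
```

The compiler turns everything after the await into the continuation for you, which is why the thread behaviour is the same as with the explicit callback.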

Conclusion

The terms parallelism and asynchrony are often used interchangeably. This is unfortunate, because the two are somewhat orthogonal; you can write code that is parallel and not asynchronous. Alternatively, you can write code that is sequential but also asynchronous. Be aware of this important distinction, and decide when writing your code whether you need one or both of them – and what tools you have at the language level: in C#, you have async/await for making async code more readable; F# has async workflows. When operating over collections, you can also use PLINQ’s AsParallel() method, as well as Parallel.ForEach(), although you will of course need to be cautious of shared state with the latter.
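To make the “sequential but also asynchronous” case concrete, here is a small sketch under the same assumptions as before (DownloadOneAtATimeAsync and FakeDownloadAsync are illustrative names, and the fake download uses Task.Delay rather than a real HTTP call). Each download is awaited before the next one starts, so the work happens strictly in order, yet no thread is blocked while waiting.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical stand-in for WebClient.DownloadStringTaskAsync.
async Task<string> FakeDownloadAsync(string uri)
{
    await Task.Delay(100);
    return "content of " + uri;
}

// Sequential AND asynchronous: one download at a time, but no thread is blocked while waiting.
async Task<List<string>> DownloadOneAtATimeAsync(
    IEnumerable<string> uris, Func<string, Task<string>> downloadAsync)
{
    var completedInOrder = new List<string>();
    foreach (var uri in uris)
    {
        await downloadAsync(uri); // the next download does not start until this one completes
        completedInOrder.Add(uri);
        Console.WriteLine($"{uri}: finished, now on thread {Environment.CurrentManagedThreadId}");
    }
    return completedInOrder;
}

var order = await DownloadOneAtATimeAsync(
    new[] { "http://example.com/a", "http://example.com/b", "http://example.com/c" },
    FakeDownloadAsync);
```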

3 thoughts on “A refresher on Async”

    1. Przemysław,

      This is so that all the requests for URLs start in parallel. This was exactly my point in the final conclusion – if you simply do a foreach (or a Select – which is basically very similar to a foreach under the bonnet) instead of a Parallel.ForEach (or AsParallel().Select()), you’ll notice that although all the requests come back on different threads, they will all be initiated on the same thread (most likely thread 1). This is because the code to start each download still runs in sequence, although the downloads themselves effectively happen in parallel.
