Meet Entity Framework, the Anti-SQL Framework

Entity Framework 6 is coming!

Entity Framework 6 is now ramping up for a release. It brings nice Async functionality, but also gives lots more power to the Code First capability, as well as also bringing EF completely out of the core .NET framework – it’s instead now a standalone NuGet package. So, the story now for creating and modifying your SQL database within EF can be considered to use these four features: –

  1. Get hold of EF via a NuGet package
  2. Create a physical data model based on an object model
  3. Update and evolve your database through code-first migrations
  4. Create a database quickly and easily using LocalDB

No dependency on .NET 4.5, or on a specific version of Visual Studio, or on having a running instance of SQL Server (even SQL Express) as a Windows Service. This sounds all great. We can now create a physical database, create migrations from one version to the next, and write queries in the language that automatically get translated into SQL. In effect, EF6 with code-first allows you to write an application without having to ever look at the database. This may be a somewhat short-sighted or simplistic view – you’ll always need to optimise your DB with indexes and perhaps SPs etc. for high performance corner cases etc. – but for quickly getting up and running, and I would suggest for a fair proportion of your SQL database requirements, EF6 as a framework will sort you out .

Why SQL?

This is all great – the barrier to entry to getting up and running in a short space of time is extremely low. However, I would then raise the question: if you’re using your database in a fashion akin to that above – where it’s just a means to an end – why use a SQL database at all? What are you actually using SQL for? As a way to store and retrieve data into your application. Your queries are written outside of SQL in e.g. LINQ. Your tables are generated outside of T-SQL. You never interact with tables because EF is acting as an ORM that does all the funky mapping for you. So what exactly are you getting from using SQL?

The point is that you can remove almost all of the impedance mismatch between your object model and your data-store in a much simpler way simply by using something like MongoDB. You don’t have a formal schema as such in the database – you simply read and write object graphs to and from a document. The concept of a “mapping” between DB and object model are almost entirely removed. There’s no need to formally upgrade a “schema” e.g. foreign keys, indexes, columns etc. etc – there’s your object model, and nothing else.


I’m not suggesting you go and dump SQL tomorrow if you’ve already invested in it. What I would suggest is that the next time you create a new database you consider why you use SQL and whether you could achieve the same use cases by using a schema-free document database like MongoDB or RavenDB. What are you really using your database for today?

Why Entity Framework renders the Repository pattern obsolete?

A post here on a pattern I thought was obsolete yet I still see cropping up in projects using EF time and time again…

What is a Repository?

The repository pattern – to me – is just a form of data access gateway. We used it to provide both a form of abstraction above the details of data access, as well as to provide testability to your calling clients, e.g. services or perhaps even view models / controllers. A typical repository will have methods such as the following:-

interface IRepository
    T GetById(Int32 id);
    T Insert(T item);
    T Update(T item);
    T Delete(T item);

interface ICustomerRepository : IRepository
    Customer GetByName(String name);

And so on. You’ll probably create a Repository<T> class which does the basic CRUD work for any <T>. Each one of these repositories will delegate to an EF ObjectContext (or DbContext for newer EF versions), and they’ll offer you absolutely nothing. Allow me to explain…

Getting to EF data in Services

Let’s illustrate the two different approaches with a simple example service method that gets the first customer whose name is an arbitrary string. In terms of objects and responsibilities, the two approaches are somewhat different. Here’s the Repository version: –

public class Service
    private readonly ICustomerRepository customerRepository;
    public Customer GetCustomer(String customerName)
        return customerRepository.GetByName(customerName);
public class CustomerRepository : ICustomerRepository
    private readonly DatabaseContext context;
    public Customer GetByName(string customerName)
        return context.Customers.First(c => c.Name == customerName);

Using the Repository pattern, you generally abstract out your actual query so that your service does any “business logic” e.g. validation etc. and then orchestrates repository calls e.g. Get customer 4, Amend name, Update customer 4 etc. etc.. You’ll also invariably end up templating (which if you read my blog regularly you know I hate) your Repositories for common logic like First, Where etc.. – all these methods will just delegate onto the equivalent method on DbSet.

If you go with the approach of talking to EF directly, you enter your queries directly in your service layer. There’s no abstraction layer between the service and EF.

public class ServiceTwo
    private readonly DatabaseContext context;

    Customer GetCustomer(String customerName)
        return context.Customers.First(c => c.Name == customerName);

So there’s now just one class, the service, which is coupled to DatabaseContext rather than CustomerRepository; we perform the query directly in the service. Notice also that Context contains all our repositories e.g. Customers, Orders etc. as a single dependency rather than one per type. Why would we want to do this? Well, you cut out a layer of indirection, reduce the number of classes you have (i.e. the whole Repository hierarchy vs a fake DbContext + Set), making your code quicker to write as well as easier to reason about.

Aha! Surely now we can’t test out our services because we’re coupled to EF! And aren’t we violating SRP by putting our queries directly into our service? I say “no” to both.

Testability without Repository

How do we fix the first issue, that of testability? There are actually many good examples online for this, but essentially, think about this – what is DbContext? At it’s most basic, it’s a class which contains multiple properties, each implementing IDbSet<T> (notice – IDbSet, not DbSet). What is IDbSet<T>? It’s the same thing as our old friend, IRepository<T>. It contains methods to Add, Delete etc. etc., and in addition implements IQueryable<T> – so you get basically the whole LINQ query set including things like First, Single, Where etc. etc.

Because DBSet<T> implements the interface IDbSet<T>, you can write your own one which uses e.g. in-memory List<T> as a backing store instead. This way your service methods can work against in-memory lists during unit tests (easy to generate test data, easy to prove tests for), whilst going against the real DBContext at runtime. You don’t need to play around with mocking frameworks – in your unit tests you can simply generate fake data and place them into your fake DBSet lists.

I know that some people whinge about this saying “it doesn’t prove the real SQL that EF will generate; it won’t test performance etc. That’s true – however, this approach doesn’t try to solve that – what it does try to do is to remove the unnecessary IRepository layer and reduce friction, whilst improving testability – for 90% of your EF queries e.g. Where, First, GroupBy etc., this will work just fine.

Violation of SRP

This one is trickier. You ideally want to be able to reuse your queries across service methods – how do we do that if we’re writing our queries inline of the service? The answer is – be pramatic. If you have a query that is used once and once only, or a few times but is a simple Where clause – don’t bother refactoring for reuse.

If, on the other hand you have a large query that is being used in many places and is difficult to test, consider making a mockable query builder that takes in an IQueryable, composes on top of it and then returns another IQueryable back out. This allows you to create common queries yet still be flexible in their application – whilst still giving you the ability to go directly to your EF context.


Testability is important when writing EF-based data-driven services. However, the Repository pattern offers little when you can write your services directly against a testable EF context. You can in fact get much better testability from an service-with-an-EF-context based approach than just with a repository, as you can test out your LINQ queries against a fake context, which at least proves your query represents what you want semantically. It’s still not a 100% tested solution, because your code does not test out the EF IQueryable provider – so it’s important that you still have some form of integration and / or performance tests against your services.

Tips on writing an EF-based ETL

I’ve been working on a relatively small ETL (that’s Extract-Transform-Load) process recently. It’s been written in C# using EF4 and is designed to migrate some data – perhaps a million rows in total – from one database schema to another as a one-off job. Nothing particularly out of the ordinary there; the object mapping is somewhat tricky but nothing too difficult. There were a few things I’ve been doing lately to improve the performance of it, and I thought I’d share some of those with you.

Separate out query and command. This is probably the hardest thing to change after you get started, so think about this up front. What I mean by this is this: let’s say you write some code that navigates through a one-to-many object graph and for each child, fires off a query to the DB to retrieve some other data, and then acts on that bit of retrieved data. You then discover that sometimes, you’ll have 5,000 children in the graph, which equates to 5,000 queries being fired off to the DB. Instead of this, why not just write a single set-based query which performs a “where in”-style query to retrieve all the data you’ll need in one go. Then you can, in memory, iterate over each of them one at a time. This will give you a big performance boost. In order to do this, you need to be careful to construct your code such that you decouple the bit that queries the data to retrieve a collection of object and the part that operates on each single object one a time, and then have a controlling method which orchestrates the two together. Doing this design upfront is much, much easier than trying to do it afterwards. In essence, if you know up front what data you need to operate on, try to pull as much of that in together rather than doing it bit-by-bit.

Keep your object graphs as small as possible. By this I mean do not read in fields or elements of the graph that you do not need. If necessary, construct DTOs and use EF’s projection capabilities to construct them, rather than reading back an entire EF-generated object graph when you only need 5 properties out of the 150 available.

Use the VS performance profiler. Or ANTS etc. The best part of the VS perf tool is the Tier Interaction Profiler (TIP). This monitors all the queries you fire off to the DB and shows you stats like how many of them there were, how long they took etc. – great for finding bottlenecks.

Avoid lazy loading. It’s seductive but will again negatively impact performance by quietly hammering the database without you realising it.

Use compiled EF queries. For queries that you are repeatedly calling – especially complex ones – compiling them will give you a nice boost in performance.

Keep the “destination” EF context load as low as possible. In the context of EF, this means NOT doing things like using a single session for the entire process. It’ll just get bigger and bigger as you add stuff to it. Instead, try to keep them relatively short lived – perhaps one for x number of source aggregate that you process.

Use No-Tracking for read-only (source) EF contexts. This means you can essentially just re-use the same context across the whole application as the context just becomes a gateway to the DB, but nothing more.

Do not batch up too much in one go. The ObjectStateManager that tracks changes behind the scenes of an EF context is a glutton for memory. If you have a large graph to save, top even 500 or 1,000 insertions and you’ll see your memory footprint creeping up and up; calling SubmitChanges() on a regular basis can alleviate this (at least, that’s the behaviour that I’ve observed).

Separate out writing to reference and entity data. If you are inserting reference data, create an entirely separate context for it. Do NOT share the entities with your main “destination” entity model. Instead, just refer to reference data items by ID. The benefits of doing this are that you can much more easily cache your reference data without falling foul of things like attaching the same object into multiple contexts etc.

Ask yourself if you really need to use EF. There are many forward-only data-access mechanisms available in .NET that out-perform EF. For reading from the source database, you could use something like Dapper or Massive intead of EF. I can’t comment on the performance of Massive, but Dapper is certainly extremely quick. You will lose the ability to write LINQ queries though, and will have to manually construct your entire source database domain model. Again though, that may not be such a bad thing if you design DTOs that work well in a set-based query environment.

Creating a testable WCF RIA Domain Service – Part 2

I posted a few weeks ago about writing a testable LinqToEntities domain service and came up with a number of options. The last one was my preferred choice i.e. use of the new keyword to “overwrite” the real context on the base class with an interface that your context implements. I’ve now refined it to create a very simple class that I call the MockableDomainService.

Here’s how it works: –

1. Create a mockable EF object context

This has been covered in many places on the net so I won’t bore you with the details; instead I’ll point you here again and show a class diagram of how your mocked context might look:


2. Copy the code for MockableDomainService

MockableDomainService inherits from LinqToEntitiesDomainService, but unlike that service class, this one is not as closely coupled to your ObjectContext. Instead it has two generic parameters – the ObjectContext (in our example, BusinessContext) and the testable interface that it implements (IBusinessContext).

You can download the code here, but here’s the code (I’m pasting the code as a screenshot since I still can’t figure out how to gracefully paste code into WordPress with Windows Live Writer…)


3. Inherit from MockableDomainService

Now take your real domain service, and inherit from MockableDomainService instead of the LinqToEntitiesDomainService. In your production code, simply create it with the default constructor. When you want to test your domain service, create it passing in your Fake object context instead. All your Domain Service queries will now execute against that instead – job done!

There’s still things that need work on this mockable domain service – you’ll still need to get it to handle change sets and the like – basically all the other stuff that LinqToEntities domain service gives you – but for getting going and at least testing out your queries, this should work just fine.

Testing out the Entity Framework WCF RIA Services Domain Service

I’ve started using RIA Services lately and have been testing it out with an EF4 back-end. Generally I’m quite impressed with it, as it gives you several things out of the box that you normally would need to spend time coding otherwise: –

  • Hosting of the WCF service
  • IQueryable on the client
  • Batch updates through the client-side domain context
  • Change change of client-side entities
  • Change notification of client-side entities

The last two are particularly nice as they mean that you don’t have to do this work on your server-side domain model – RIA services automatically code gens up client-side types that do this automatically.

On the server side, you have a domain service which is essentially just a class that inherits from DomainService, which in turn acts as a smart WCF service, doing a lot of the work of hosting the service etc. for you for free. However, if you’re using EF4, you can inherit from LinqToEntitiesDomainService<TObjectContext>, which gives you some more things out of the box that you would normally have to code up for your domain service. This includes a property called ObjectContext which is the context used for accessing your data store.

Testing out your domain service

Unfortunately, the above EF Domain Service couples itself to the concrete object context, and so thus rules out testing your domain service or the queries that you execute in that service. I looked around and found precious few examples on how to test out a LinqToEntities Domain Service. There’s an example for testing out a LinqToSql service, but that’s it. So, here are a few options I came up with to allow you to test out your Entity Framework Domain Service.

Use the standard Domain Service

Here, you simply use the standard domain service rather than the EF one, which gives you the ability to inject your ObjectContext as you see fit.

Create a “regular” domain service, and reference an interface to your object context (lets call it IObjectContext) rather than a concrete object context (I suggest looking here for details on how to use IObjectSet as a way to mock the object context). You can inject the context manually or use an IoC framework to do it for you. Then, you write queries to your data source through the IObjectContext etc.


This solution functions, but you will lose any extra features that the LinqToEntitiesDomainService gives you – I’ll blog about this in the future.

Use the repository pattern to test your queries

In this example, your use the real EF Domain Service, but instead of writing queries directly in your service, you create a Repository class which takes in an IObjectContext and executes its queries there. You can test out your repository queries in the normal manner i.e. in your unit tests, create a fake ObjectContext and supply that to your repository class.


However, your cannot test out any of your domain service “business logic” code in MyService as this is still tightly coupled to the concrete ObjectContext. Only your queries against the context can be tested.

Put your real logic in a delegated class

Here, you create a class which will have the same method signatures as your domain service, and your domain service delegates all calls to it. Your domain service is a “proper” EF Domain Service, so you get all the benefits of lifecycle management etc. of the context, but still have the ability to fake out the context to make your code testable. In effect, your Domain Service becomes a proxy for the real business logic.

Create an interface for your Domain Service which has all the methods that you’re going to expose. Then, create another class which implements that interface and contains all your business logic and queries in. In essence, this is the “real” domain service, except it doesn’t inherit from Domain Service etc.. In your Domain Service class, you create an instance of this “delegated” service, passing in the concrete object context, and delegate all calls to it. Because the “delegated” service class is only coupled to IObjectContext, you can full test it out, yet you get the benefits of EFDomainService because that class is still created.


However, this is possible the most complicated of all the different ways of abstracting away the context, and you have a somewhat ugly manual delegation of calls going on – and for every new domain method you create, you need to update the interface etc., so it’s a bit of work.

Shim in a fake context

In this scenario, you simply create a new property on your EF Domain Service which overrides and thus hides the “real” Object Context. The new property is the interface; all access to the context goes via this property rather than the base ObjectContext.


With this fairly lightweight approach, you create a property with the same name as the “real” context that exists on the base class, ObjectContext, and use the new keyword to override it. You then use constructor-based injection to use a fake context when required; otherwise you use the base ObjectContext.


With this approach, as long as you remember to never use base.ObjectContext, you can effectively shim in a fake context without any need for proxies or repository classes; your service is still fully testable and you get all the goodness of the LinqToEntities Domain Service.

NuGet, EF4.1 and SQL Compact 4

As I’m waiting around this week for furniture to be delivered to my new abode, I was passing the time today by trying out EF4.1. I thought it’d also be a good opportunity to try out NuGet in order to see how easy it is to download packages + dependencies etc. etc..

So, the test was: Download EF4.1 and SQL Compact 4 with NuGet to allow me to create a simple data model and get some CRUD functionality up on screen in a WPF screen. It failed me.

NuGet itself seems very nice – the ability to easily download packages and their associated dependencies for a VS solution, integrated within the IDE etc. – great stuff; I will definitely use it in future.

Unfortunately, neither the package for EF4.1 nor SQL Compact 4 contains the System.Data.SqlServerCe.Entity.dll, thus as soon as you try to make a connection to a SQL Compact database with EF4.1, your application will crash:

Could not load System.Data.SqlServerCe.Entity.dll. Reinstall SQL Server Compact.

The solution is to manually install SQL Compact 4 from this link. Once done, you should be good to go.

So, a failure for NuGet insofar as the packages supplied did not contain the correct assemblies for what was required – however, NuGet in general seems very impressive and I would recommend you taking a look at it in future.




There are actually two SQL Compact packages on nuGet. One is simply called SQLServerCompact; the other is called EntityFramework.SqlServerCompact. I had downloaded the former during my attempt described above. If you download the latter package, you will find that you get the .Entity.dll and don’t need to install SQL Compact 4 separately. Not sure that it’s entirely clear though that there are two versions of SQL Compact 4 in nuGet…. one which is “compatible” with EF4 and one which isn’t…

Entity Framework Code-First almost released

I should actually qualify that by saying the first version of it is almost released. I blogged about Code First what seems like ages ago, and it’s great to see a version of it getting out there. However, this version won’t ship with several features – some of which are in the current model- and database-first scenarios, and some of which aren’t – for example: –

  • No SP support
  • No compiled queries
  • No enum support (as per the rest of EF)
  • No user-defined mapping conventions

    All are on the roadmap – but in the interests of getting it out there, they’ve been pushed back to version 2. The feedback on one of the announcement pages has been pretty negative – I can’t understand why. Yes, the work isn’t complete yet. Yes, there are features that need to be added before everyone’s happy. In fact, there’ll always be someone who doesn’t like it, but hey….

What is important IMHO is that the ADO .NET team are adopting a more community-based approach – get smaller iterations of code out there and get feedback quickly. EF3.5 was OK but not much more – EF4 went a long way to improving the state of affairs; EF4.1 adds a brand new class-based approach without any EDMX etc. – give them some time and I’m sure it’ll have all the features that we want. Otherwise, they’d never release it. Just look at Firefox 4 – a great browser, but it took so long to get out there due to (IMHO) large scope creep and poor roadmapping.

My personal wish list feature would be to see better integration between EF’s model-first approach and DB projects. Currently, once you go live with a database, you effectively have to move over to database-first modelling since EF offers no integration with DB projects. I’d love to see EF modify tables and such in a database project once you’ve modified your model.

So, EF4.1 might not do everything you want. But at least it’s out there – we can try it out with full support from MS (or as much as MS Connect offers….) and go from there.