In the words of Professor Farnsworth – Good news everybody! I’ve finally gotten around to looking at adding some basic Azure Table Storage support to the Azure Type Provider.
Why Table Storage?
There are some difficulties with interacting with Azure Table Storage through the native .NET API. Some of them limit how useful the Type Provider can be, and some are things the Type Provider can help with:
- The basic API gives you back an IQueryable, but you can only use Where, Take and First. Any other calls will give a runtime exception.
- You can write arbitrary queries against a table within the above restriction, but doing so will invoke a full table scan.
- The quickest way of getting an entity is by its Partition and Row keys; otherwise you’ll effectively initiate a full (or at best, a partial) table scan.
- You can’t get the number of rows in a table without iterating through every row
- You can’t get a complete list of partitions in a table without iterating through every row
- There’s no fixed schema. You can create your own types, but these need to inherit from TableEntity. Alternatively, you can use DynamicTableEntity to give you key/value pair access to every row; however, accessing the values of an entity is a pain, as you must pick a specific typed “getter” for each one, e.g. BooleanValue or StringValue.
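To make the access patterns in the list above concrete, here is a toy in-memory model, written in Python purely for illustration. It is not the Azure SDK and says nothing about how the service is implemented; ToyTable, its methods and the sample rows are all made up. It only shows why a lookup by both keys (the partition key plus the key of the entity within that partition) is cheap, while filtering, counting and listing partitions all mean visiting every row.

```python
from typing import Callable, Dict, Iterator, Optional, Set, Tuple

Entity = Dict[str, object]
Key = Tuple[str, str]  # (partition key, row key)


class ToyTable:
    """Hypothetical stand-in for a table; not part of any Azure library."""

    def __init__(self) -> None:
        self._rows: Dict[Key, Entity] = {}

    def insert(self, partition_key: str, row_key: str, entity: Entity) -> None:
        self._rows[(partition_key, row_key)] = entity

    def get(self, partition_key: str, row_key: str) -> Optional[Entity]:
        # Point lookup: both keys are known, so no iteration is needed.
        return self._rows.get((partition_key, row_key))

    def where(self, predicate: Callable[[Entity], bool]) -> Iterator[Entity]:
        # Arbitrary filter: every row has to be visited and tested.
        return (e for e in self._rows.values() if predicate(e))

    def count(self) -> int:
        # Deliberately modelled as a walk over every row.
        return sum(1 for _ in self._rows)

    def partitions(self) -> Set[str]:
        # Also a walk over every row, collecting the distinct partition keys.
        return {partition_key for partition_key, _ in self._rows}


# Made-up sample data.
table = ToyTable()
table.insert("team-a", "1", {"Name": "Fry"})
table.insert("team-a", "2", {"Name": "Leela"})
table.insert("team-b", "1", {"Name": "Bender"})

fry = table.get("team-a", "1")                               # no scan
leelas = list(table.where(lambda e: e["Name"] == "Leela"))   # full scan
```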
So, how does the Type Provider help you?
Well, first, you’ll automatically get back the list of tables in your storage account, for free. On dotting into a table, the provider will infer the schema from the first few rows (currently I’ve set this to 20) and will automatically generate the entity type.
How do we do this? Well, a table doesn’t have a schema that all rows must conform to, but what you do get on each cell of each entity returned is metadata, including its EDM type, which can be mapped to a regular .NET type; this is made easier when using DynamicTableEntity. The generated properties in the Type Provider use the EDM data from the row to get the data back as the correct type, e.g. String, Int32 and so on. Where entities in the same table have different shapes, the provider collates them into a single merged entity that is the union of all the shapes it has seen.
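As a rough illustration of the inference step, here is a minimal Python sketch. It is not the Type Provider’s actual code (which is F#); the infer_schema function, the EDM_TO_PYTHON mapping, the row representation (property name mapped to an EDM type name and a value) and the sample rows are all assumptions made for the example. It samples the first few rows and merges their shapes into one schema. What happens when two rows disagree on a property’s type isn’t covered above, so the sketch simply keeps the first type it sees.

```python
from itertools import islice
from typing import Any, Dict, Iterable, Tuple

# Hypothetical mapping from EDM type names to Python types (illustration only).
EDM_TO_PYTHON: Dict[str, type] = {
    "Edm.String": str,
    "Edm.Int32": int,
    "Edm.Int64": int,
    "Edm.Double": float,
    "Edm.Boolean": bool,
}

# A row maps each property name to (EDM type name, value).
Row = Dict[str, Tuple[str, Any]]


def infer_schema(rows: Iterable[Row], sample_size: int = 20) -> Dict[str, type]:
    """Merge the shapes of the first `sample_size` rows into one schema."""
    schema: Dict[str, type] = {}
    for row in islice(rows, sample_size):
        for name, (edm_type, _value) in row.items():
            # Assumption: the first type seen for a property name wins.
            schema.setdefault(name, EDM_TO_PYTHON[edm_type])
    return schema


# Made-up rows with two different shapes in the same table.
rows = [
    {"Name": ("Edm.String", "Fry"), "Cost": ("Edm.Double", 1.5)},
    {"Name": ("Edm.String", "Leela"), "Team": ("Edm.String", "Planet Express")},
]

schema = infer_schema(rows)
```

Because a property found in one sampled row may be absent from another, every property in the merged shape has to be treated as possibly missing on any given entity.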
Once this is done, you can pull back all the rows from a specific table partition into memory and then query them to your heart’s content. Here’s a little sample to get you started – imagine a table as follows: –
Then with the Azure Type Provider you can do as follows: –
- The good: player is strongly typed, down to the fact that the Cost property is a float option (not a string or object).
- The ugly: You have to explicitly supply the Partition key as plain text. There’s no easy way to get a complete list of all partitions, although I am hoping to at least suggest some partition keys based on, e.g., the first 100 rows.
What doesn’t it do (yet)?
- You currently can’t write arbitrary queries to execute on the server. You can pull back all the entities for a particular partition key, but that’s it; you can’t specify a limit on how many entities to bring back either. I want to look at ways that you can create query expressions over these provided types, or at least ways you can create “weak” queries (off of the standard CreateQuery() call) and then pipe them into the provider.
- All properties of all entities are option types. This is not so different from the real underlying Table Storage fields in a DynamicTableEntity, which are returned as nullables for all value types (the only EDM reference type is String). It is also partly because there’s no easy way to know whether any column is optional or not. I would like to let the user choose otherwise, though, e.g. to say that fields should not be option types and should instead return default(T) or throw an exception when a value is missing.
- You can’t search for an individual entity by Row Key (yet)
- You can’t download an entire table as a CSV yet – but you will be able to shortly
- No write support
- No async support (yet)
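On the option types point above, the three behaviours being weighed up can be sketched roughly as follows. This is Python rather than F#, so None stands in for a missing option value; the function names and the sample entity are made up for illustration and are not part of the provider.

```python
from typing import Any, Dict, Optional

Entity = Dict[str, Any]


def get_option(entity: Entity, name: str) -> Optional[Any]:
    """Option style (current behaviour): a missing property comes back as None."""
    return entity.get(name)


def get_or_default(entity: Entity, name: str, default: Any) -> Any:
    """default(T) style: a missing property is replaced by a caller-supplied default."""
    value = entity.get(name)
    return default if value is None else value


def get_or_raise(entity: Entity, name: str) -> Any:
    """Exception style: a missing property is treated as an error."""
    value = entity.get(name)
    if value is None:
        raise KeyError(f"entity has no value for property {name!r}")
    return value


# Made-up entity that has a Name but no Cost.
player: Entity = {"Name": "Fry"}

maybe_cost = get_option(player, "Cost")
cost_or_zero = get_or_default(player, "Cost", 0.0)
```

The option style pushes the “is it there?” check onto every caller, which is safe but noisy; the other two are more convenient when you know your data is uniform.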