Querying azure table storage for null values

Brosto picture Brosto · Nov 19, 2010 · Viewed 13.4k times · Source

Does anyone know the proper way to query azure table storage for a null value. From what I've read, it's possible (although there is a bug which prevents it on development storage). However, I keep getting the following error when I do so on the live cloud storage:

One of the request inputs is not valid.

This is a dumbed down version of the LINQ query that I've put together.

var query = from fooBar in fooBarSVC.CreateQuery<FooBar>("FooBars")
        where fooBar.PartitionKey == kPartitionID
            && fooBar.Code == kfooBarCode
            && fooBar.Effective_Date <= kFooBarDate.ToUniversalTime()
            && (fooBar.Termination_Date > kFooBarDate.ToUniversalTime() || fooBar.Termination_Date == null)
        select fooBar;

If I run the query without checking for null, it works fine. I know a possible solution would be to run a second query on the collection that this query brings back. I don't mind doing that if I need to, but would like to know if I can get this approach to work first.

Anyone see anything obvious I'm doing wrong?

Answer

Nathan Totten picture Nathan Totten · Nov 24, 2010

The problem is that because azure table storage does not have a schema, the null column actually doesn't exist. This is why your query is not valid. there is no such thing as a null column in table storage. You could do something like store an empty string if you really have to. Really though the fundamental issue here is that Azure table storage really is not built to be queried by any columns other than partition key and row key. Every time you make a query on one of these non-standard columns you are doing a table scan. If you start to get lots of data you are going to have a very high rate of query time outs. I would suggest setting up a manual index for these types of queries. For example, you could store the same data in the same table but with different values for the Row key. Ultimately, if your are app is not getting crazy high usage I would just use SQL Azure as it will be much more flexible for the types of queries you are doing.

Update: Azure has a great guide on table storage design that I would recommend reading. http://azure.microsoft.com/en-us/documentation/articles/storage-table-design-guide/