Repository Pattern Best Practice

Jason N. Gaylord · Aug 28, 2009

So I'm implementing the repository pattern in an application and came across two "issues" in my understanding of the pattern:

  1. Querying - I've read responses saying that IQueryable should not be used with repositories. However, it seems obvious that you'd want to use it so that you are not returning a complete List of objects each time you call a method. Should it be implemented? If I have an IEnumerable method called List, what's the general "best practice" for an IQueryable? What parameters should/shouldn't it have?

  2. Scalar values - What's the best way (using the Repository pattern) to return a single, scalar value without having to return the entire record? From a performance standpoint, wouldn't it be more efficient to return just a single scalar value over an entire row?

Answer

pfries · Aug 28, 2009

Strictly speaking, a Repository offers collection semantics for getting/putting domain objects. It provides an abstraction around your materialization implementation (ORM, hand-rolled, mock) so that consumers of the domain objects are decoupled from those details. In practice, a Repository usually abstracts access to entities, i.e., domain objects with identity, and usually a persistent life-cycle (in the DDD flavor, a Repository provides access to Aggregate Roots).

A minimal interface for a repository is as follows:

void Add(T entity);
void Remove(T entity);
T GetById(object id);
IEnumerable<T> Find(Specification spec);

Although you'll see naming differences and the addition of Save/SaveOrUpdate semantics, the above is the 'pure' idea. You get the ICollection Add/Remove members plus some finders. If you don't use IQueryable, you'll also see finder methods on the repository like:

IEnumerable<Customer> FindCustomersHavingOrders();
IEnumerable<Customer> FindCustomersHavingPremiumStatus();
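
To make the 'pure' shape concrete, here is a minimal sketch of the interface above with a named finder added, backed by an in-memory dictionary instead of an ORM. The generic Specification<T> class, the Customer entity, PremiumCustomerSpec and InMemoryCustomerRepository are all assumptions made up for illustration; the answer only names Specification and does not define it.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical specification base class (not defined in the answer).
public abstract class Specification<T>
{
    public abstract bool IsSatisfiedBy(T candidate);
}

// The minimal repository interface from the answer, made generic.
public interface IRepository<T>
{
    void Add(T entity);
    void Remove(T entity);
    T GetById(object id);
    IEnumerable<T> Find(Specification<T> spec);
}

// Hypothetical entity, for illustration only.
public class Customer
{
    public int Id { get; set; }
    public string Name { get; set; }
    public bool IsPremium { get; set; }
}

public class PremiumCustomerSpec : Specification<Customer>
{
    public override bool IsSatisfiedBy(Customer candidate)
    {
        return candidate.IsPremium;
    }
}

// Entity-specific repository that adds a named finder.
public interface ICustomerRepository : IRepository<Customer>
{
    IEnumerable<Customer> FindCustomersHavingPremiumStatus();
}

// In-memory stand-in for an ORM-backed implementation.
public class InMemoryCustomerRepository : ICustomerRepository
{
    private readonly Dictionary<int, Customer> _items = new Dictionary<int, Customer>();

    public void Add(Customer entity) { _items[entity.Id] = entity; }

    public void Remove(Customer entity) { _items.Remove(entity.Id); }

    public Customer GetById(object id)
    {
        Customer found;
        return _items.TryGetValue((int)id, out found) ? found : null;
    }

    public IEnumerable<Customer> Find(Specification<Customer> spec)
    {
        // Materialize before returning so callers cannot compose further queries.
        return _items.Values.Where(c => spec.IsSatisfiedBy(c)).ToList();
    }

    public IEnumerable<Customer> FindCustomersHavingPremiumStatus()
    {
        return Find(new PremiumCustomerSpec());
    }
}
```

Note that every new query requirement means either a new specification class or a new named finder, which is the growth the rest of this answer is concerned with.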

There are two related problems with using IQueryable in this context. The first is the potential to leak implementation details to the client in the form of the domain object's relationships, i.e., violations of the Law of Demeter. The second is that the repository acquires finding responsibilities that might not belong to the domain object repository proper, e.g., finding projections that are less about the requested domain object than the related data.

Additionally, using IQueryable 'breaks' the pattern: A Repository with IQueryable may or may not provide access to 'domain objects'. IQueryable gives the client a lot of options about what will be materialized when the query is finally executed. This is the main thrust of the debate about using IQueryable.
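
A small sketch of what that looks like from the client side, under some assumptions: the Customer and Order entities, QueryableCustomerRepository and ReportingClient are hypothetical names, and AsQueryable() over a List stands in for whatever query provider an ORM would supply.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical entities, for illustration only.
public class Order
{
    public decimal Total { get; set; }
}

public class Customer
{
    public string Name { get; set; }
    public List<Order> Orders { get; set; }
}

public class QueryableCustomerRepository
{
    private readonly List<Customer> _store;

    public QueryableCustomerRepository(List<Customer> store)
    {
        _store = store;
    }

    // An ORM would hand back its own provider; AsQueryable() is an in-memory stand-in.
    public IQueryable<Customer> Find()
    {
        return _store.AsQueryable();
    }
}

public static class ReportingClient
{
    // The client reaches through Customer into its Orders relationship and
    // materializes plain strings, so no domain object ever comes back.
    public static List<string> NamesOfBigSpenders(QueryableCustomerRepository repo, decimal threshold)
    {
        return repo.Find()
            .Where(c => c.Orders.Any(o => o.Total > threshold))
            .Select(c => c.Name)
            .ToList();
    }
}
```

Nothing in the repository's signature tells you that this query navigates Orders or that the result is a projection; that decision now lives entirely with the caller.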

Regarding scalar values, you shouldn't be using a repository to return scalar values. If you need a scalar, you would typically get this from the entity itself. If this sounds inefficient, it is, but you might not notice, depending on your load characteristics/requirements. In cases where you need alternate views of a domain object, because of performance reasons or because you need to merge data from many domain objects, you have two options.

1) Use the entity's repository to find the specified entities and project/map to a flattened view.

2) Create a finder interface dedicated to returning a new domain type that encapsulates the flattened view you need. This wouldn't be a Repository because there would be no Collection semantics, but it might use existing repositories under the covers.
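
The two options can be combined, as in the rough sketch below: a dedicated finder (option 2) that uses the entity repository under the covers and projects to a flattened type (option 1). CustomerSummary, ICustomerSummaryFinder, RepositoryBackedSummaryFinder and the in-memory repository are hypothetical names chosen for the example, not part of any library.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical entities, for illustration only.
public class Order
{
    public decimal Total { get; set; }
}

public class Customer
{
    public string Name { get; set; }
    public List<Order> Orders { get; set; }
}

// Hypothetical flattened view over a customer and its orders.
public class CustomerSummary
{
    public string Name { get; set; }
    public int OrderCount { get; set; }
    public decimal TotalSpent { get; set; }
}

public interface ICustomerRepository
{
    IEnumerable<Customer> FindCustomersHavingOrders();
}

// A finder, not a Repository: there are no Add/Remove collection semantics.
public interface ICustomerSummaryFinder
{
    IEnumerable<CustomerSummary> FindSummaries();
}

public class RepositoryBackedSummaryFinder : ICustomerSummaryFinder
{
    private readonly ICustomerRepository _customers;

    public RepositoryBackedSummaryFinder(ICustomerRepository customers)
    {
        _customers = customers;
    }

    public IEnumerable<CustomerSummary> FindSummaries()
    {
        // Load full entities through the repository, then map to the flat view.
        return _customers.FindCustomersHavingOrders()
            .Select(c => new CustomerSummary
            {
                Name = c.Name,
                OrderCount = c.Orders.Count,
                TotalSpent = c.Orders.Sum(o => o.Total)
            })
            .ToList();
    }
}

// In-memory stand-in for an ORM-backed repository.
public class InMemoryCustomerRepository : ICustomerRepository
{
    private readonly List<Customer> _store;

    public InMemoryCustomerRepository(List<Customer> store)
    {
        _store = store;
    }

    public IEnumerable<Customer> FindCustomersHavingOrders()
    {
        return _store.Where(c => c.Orders.Count > 0).ToList();
    }
}
```

This version still pays for loading whole entities. If that cost matters, the same finder interface could be reimplemented with a direct query against the store without changing its callers.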

One thing to consider if you use a 'pure' Repository to access persisted entities is that you compromise some of the benefits of an ORM. In a 'pure' implementation, the client can't provide context for how the domain object will be used, so you can't tell the repository: 'hey, I'm just going to change the customer.Name property, so don't bother getting those eager-loaded references.' On the flip side, the question is whether a client should know about that stuff. It's a double-edged sword.

As for using IQueryable, most people seem to be comfortable with 'breaking' the pattern to get the benefits of dynamic query composition, especially for client responsibilities like paging/sorting. In that case, you might have:

void Add(T entity);
void Remove(T entity);
T GetById(object id);
IQueryable<T> Find();

and you can then do away with all those custom Finder methods, which really clutter the Repository as your query requirements grow.
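
As a sketch of what the client gains, here is paging and sorting composed on top of Find() rather than baked into the repository. The Customer entity, InMemoryCustomerRepository and the CustomerPaging helper are hypothetical, and the List-backed AsQueryable() is only a stand-in; with a real ORM provider the same composition would normally be translated into the underlying query.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// The IQueryable-flavored interface from the answer.
public interface IRepository<T>
{
    void Add(T entity);
    void Remove(T entity);
    T GetById(object id);
    IQueryable<T> Find();
}

// Hypothetical entity, for illustration only.
public class Customer
{
    public int Id { get; set; }
    public string Name { get; set; }
}

// In-memory stand-in for an ORM-backed implementation.
public class InMemoryCustomerRepository : IRepository<Customer>
{
    private readonly List<Customer> _items = new List<Customer>();

    public void Add(Customer entity) { _items.Add(entity); }

    public void Remove(Customer entity) { _items.Remove(entity); }

    public Customer GetById(object id)
    {
        int key = (int)id;
        return _items.FirstOrDefault(c => c.Id == key);
    }

    public IQueryable<Customer> Find()
    {
        return _items.AsQueryable();
    }
}

public static class CustomerPaging
{
    // Sorting and paging are client concerns composed onto the query;
    // nothing is materialized until ToList() runs.
    public static List<Customer> PageByName(IRepository<Customer> repo, int pageIndex, int pageSize)
    {
        return repo.Find()
            .OrderBy(c => c.Name)
            .Skip(pageIndex * pageSize)
            .Take(pageSize)
            .ToList();
    }
}
```

A call such as CustomerPaging.PageByName(repo, 1, 2) asks for the second page of two customers ordered by name, with no paging-specific method on the repository itself.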