CQRS event store aggregate vs projection

webish · Jan 30, 2017 · Viewed 7.5k times

In a CQRS event store, does an "aggregate" contain a summarized view of the events or simply a reference to the boundary of those events? (group id)

A projection is a view or representation of events so in the case of an aggregate representing a boundary that would make sense to me whereas if the aggregate contained the current summarized state I'd be confused about duplication between the two.

Answer

VoiceOfUnreason · Jan 30, 2017

In a CQRS event store, does an "aggregate" contain a summarized view of the events or simply a reference to the boundary of those events? (group id)

Aggregates don't exist in the event store

  • Events live in the event store
  • Aggregates live in the write model (the C of CQRS)

Aggregate, in this case, still has the same basic meaning that it had in the "Blue Book"; it's the term for a boundary around one or more entities that are immediately consistent with each other. The responsibility of the aggregate is to ensure that writes (commands) to the book of record respect the business invariant.
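As a rough illustration of an aggregate guarding an invariant on the write side, here is a minimal sketch. The `TradingBook` class, its method names, the event dictionaries, and the specific rules it checks are all assumptions made for this example, not part of any particular framework.

```python
from dataclasses import dataclass, field


class InvariantViolation(Exception):
    """Raised when a command would break a business rule."""


@dataclass
class TradingBook:
    # Hypothetical aggregate: tracks the ids of orders that are currently open.
    book_id: str
    open_orders: set = field(default_factory=set)

    def place_order(self, order_id: str, side: str, price: int, volume: int) -> list:
        # The rules below are assumed for illustration only.
        if side not in ("buy", "sell"):
            raise InvariantViolation(f"unknown side: {side}")
        if volume <= 0:
            raise InvariantViolation("volume must be positive")
        if order_id in self.open_orders:
            raise InvariantViolation(f"order {order_id} is already open")
        self.open_orders.add(order_id)
        # A successful command yields the events to append to the book of record.
        return [{"type": "OrderPlaced", "order_id": order_id,
                 "side": side, "price": price, "volume": volume}]

    def cancel_order(self, order_id: str) -> list:
        if order_id not in self.open_orders:
            raise InvariantViolation(f"order {order_id} is not open")
        self.open_orders.discard(order_id)
        return [{"type": "OrderCancelled", "order_id": order_id}]
```

The point of the sketch is that every write goes through the aggregate, so a command that would violate the rule is rejected before any event is recorded.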

It's typically convenient, in an event store, to organize the events into "streams"; if you imagine a RDBMS schema, the stream id will just be some identifier that says "these events are all part of the same history."

It will usually be the case that one aggregate maps to one stream, but usually isn't always; there are some exceptional cases you may need to handle when you change your model. Greg Young is covering some of these in his new eBook on event versioning.
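To make the idea of streams concrete, here is a minimal in-memory sketch in which the stream id is simply the key that groups events into one history. `InMemoryEventStore`, `StoredEvent`, and the `expected_version` check are hypothetical names and choices for this example; real event stores expose their own APIs.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class StoredEvent:
    stream_id: str   # "these events are all part of the same history"
    version: int     # position within the stream, starting at 1
    type: str
    data: dict


class InMemoryEventStore:
    def __init__(self):
        self._streams = defaultdict(list)

    def append(self, stream_id, event_type, data, expected_version):
        stream = self._streams[stream_id]
        # Reject the write if someone else appended to the stream first.
        if len(stream) != expected_version:
            raise ValueError(
                f"expected version {expected_version}, stream is at {len(stream)}"
            )
        event = StoredEvent(stream_id, len(stream) + 1, event_type, data)
        stream.append(event)
        return event

    def read(self, stream_id, after_version=0):
        # Return the events recorded after the given version, oldest first.
        return list(self._streams.get(stream_id, [])[after_version:])
```

In the common one-aggregate-to-one-stream case, loading an aggregate amounts to calling `read` with that aggregate's id.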

So it's possible that the same data structure might exist in the aggregate and query side store (duplicated view used for different purposes).

Yes, and no. It's absolutely the case that the data structures used when validating a write can match those used to support a query. But the storage doesn't usually match. Put another way, aggregates don't get stored (the state of the aggregate does), whereas it is fairly common that the query view gets cached (again, not the data structure itself, but a representation that can be used to repopulate the data structure without necessarily needing to replay all of the events).

Any chance you have an example of aggregate state data structure (rdbms)? Every example I've found is trimmed down to a few columns, something like id, source_id, version, making it difficult to visualize what the scope of an aggregate is.

A common example would be that of a trading book (an aggregate responsible for matching "buy" and "sell" orders).

In a traditional RDBMS store, that would probably look like a row in a books table, with a unique id for the book, information about what item that book is tracking, date information about when that book is active, and so on. In addition, there's likely to be some sort of orders table, with unique ids, a trading book id, order type, transaction numbers, prices and volumes (in other words, all of the information the aggregate needs to know to satisfy its invariant).
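A sketch of what that relational layout might look like, using Python's built-in `sqlite3` module so it can be run as-is. The table and column names, the column types, and the sample rows are assumptions chosen for illustration; the answer only names the kinds of information involved.

```python
import sqlite3

# Hypothetical schema: one row per book, many order rows inside its boundary.
SCHEMA = """
CREATE TABLE books (
    book_id      TEXT PRIMARY KEY,
    item         TEXT NOT NULL,
    active_from  TEXT NOT NULL,
    active_until TEXT
);
CREATE TABLE orders (
    order_id           TEXT PRIMARY KEY,
    book_id            TEXT NOT NULL REFERENCES books(book_id),
    order_type         TEXT NOT NULL CHECK (order_type IN ('buy', 'sell')),
    transaction_number INTEGER NOT NULL,
    price              INTEGER NOT NULL,
    volume             INTEGER NOT NULL
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.execute(
    "INSERT INTO books VALUES (?, ?, ?, ?)",
    ("book-1", "ACME", "2017-01-30", None),
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?, ?, ?)",
    [
        ("o1", "book-1", "buy", 1, 100, 5),
        ("o2", "book-1", "sell", 2, 101, 3),
    ],
)

# Loading the aggregate means reading its book row plus every order it owns.
rows = conn.execute(
    "SELECT order_id, order_type, price, volume FROM orders "
    "WHERE book_id = ? ORDER BY transaction_number",
    ("book-1",),
).fetchall()
```

Seen this way, the scope of the aggregate is the book row together with all order rows that carry its `book_id`.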

In a document store, you'd see all of that information in a single document -- perhaps a json document with the information about the root object, and two lists of order objects (one for buys, one for sells).

In an event store, you'd see the individual OrderPlaced, TradeOccurred, and OrderCancelled events.

it seems that the aggregate is computed using the entire set of events unless it gets large enough to warrant a snapshot.

Yes, that's exactly right. If you are familiar with a "fold function", then event sourcing is just a fold from some common initial state. When a snapshot is available, we'll fold from that state (with a corresponding reduction in the number of events that get folded in).
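The fold can be sketched with `functools.reduce`. The state shape (two dictionaries of open volume per order), the event field names, and the `apply` and `fold` helpers are assumptions made for this example; only the event type names come from the discussion above.

```python
from functools import reduce

# Assumed initial state: open volume per order id, split by side.
INITIAL = {"buys": {}, "sells": {}}


def apply(state, event):
    """Return the next state; the state passed in is never mutated."""
    buys, sells = dict(state["buys"]), dict(state["sells"])
    by_side = {"buy": buys, "sell": sells}
    kind = event["type"]
    if kind == "OrderPlaced":
        by_side[event["side"]][event["order_id"]] = event["volume"]
    elif kind == "OrderCancelled":
        buys.pop(event["order_id"], None)
        sells.pop(event["order_id"], None)
    elif kind == "TradeOccurred":
        # Reduce both matched orders; drop any order that is fully filled.
        for orders, key in ((buys, "buy_order_id"), (sells, "sell_order_id")):
            remaining = orders[event[key]] - event["volume"]
            if remaining > 0:
                orders[event[key]] = remaining
            else:
                del orders[event[key]]
    return {"buys": buys, "sells": sells}


def fold(events, state=INITIAL):
    return reduce(apply, events, state)


history = [
    {"type": "OrderPlaced", "order_id": "o1", "side": "buy", "volume": 5},
    {"type": "OrderPlaced", "order_id": "o2", "side": "sell", "volume": 3},
    {"type": "TradeOccurred", "buy_order_id": "o1", "sell_order_id": "o2", "volume": 3},
    {"type": "OrderPlaced", "order_id": "o3", "side": "sell", "volume": 4},
    {"type": "OrderCancelled", "order_id": "o3"},
]

full = fold(history)                          # replay everything
snapshot = fold(history[:3])                  # state captured after three events
from_snapshot = fold(history[3:], snapshot)   # replay only the newer events
```

Both routes arrive at the same state; starting from the snapshot just folds in fewer events.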

In an event sourced environment with "snapshots", you might see a combination of the event store and the document store (where the document would include additional meta information indicating where in the event stream it had been assembled).
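One possible shape for such a snapshot document, sketched with the standard `json` module. The field names `stream_id`, `version`, and `state`, the helper functions, and the assumption that events carry a 1-based `version` within their stream are all hypothetical.

```python
import json


def make_snapshot(stream_id, version, state):
    """Serialize the state together with the stream position it was built at."""
    return json.dumps({"stream_id": stream_id, "version": version, "state": state})


def events_to_replay(snapshot_json, stream):
    """Return the snapshot state and the events recorded after the snapshot."""
    doc = json.loads(snapshot_json)
    # Assumes every event carries its 1-based position within the stream.
    newer = [e for e in stream if e["version"] > doc["version"]]
    return doc["state"], newer


stream = [
    {"version": 1, "type": "OrderPlaced", "order_id": "o1"},
    {"version": 2, "type": "OrderPlaced", "order_id": "o2"},
    {"version": 3, "type": "OrderCancelled", "order_id": "o1"},
]

# Snapshot taken after the first two events of the stream.
doc = make_snapshot("book-1", 2, {"open_orders": ["o1", "o2"]})
state, newer = events_to_replay(doc, stream)
```

The stored version is what tells the loader where in the event stream to resume, so only the events after it need to be folded into the restored state.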