Should I use EAV model?

Cobby picture Cobby · Nov 1, 2010 · Viewed 12.9k times · Source

I'm am designing my database/domain for an eCommerce application and I'm having a hard time figuring out how to store products.

The website will sell a wide range of products, pens, thongs, tattoos, umbrellas, everything. Each of these product will share a few common attributes, height, width, length, weight, etc but some products have special data. For example, pens have different ink colors and tips/lids and brochures can have different types of folds. So far I have thought up some 20+ extra attributes, but these attributes may only apply to 1% of products on the website.

So I am wondering if it is appropriate to implement a EAV model to handle the extra data. Keeping in mind that when customers are viewing the site in the frontend, there will be a filtering sidebar like on eBay and carsales.com.au. (So keeping in mind there will be a fair bit of querying)

I don't think it's practical to implement Class Table inheritance as the system needs to remain flexible. This is because, down the track we may have more attributes in the future with new types of products.

The other thing I have considered is using a NoSQL database (probably MongoDB) however I have little experience with these types of databases, will it even solve my problem?

Review of options:

  1. Single products entity with lots of columns
  2. Separate attributes entity (EAV)
  3. Switch to schema-less persistence

I'm in the process of building a prototype with an attributes entity to see how flexible it is, and testing the performance and how out of control the querying gets.

EDIT: I am, of course, open to any other solutions.

Answer

Jonathan Day picture Jonathan Day · Nov 1, 2010

Great question, but of course, there is no "one true way". As per @BenV, Magento does use the EAV model. My experience with it has been overwhelmingly positive, however it does trip up other users. Some considerations:

1. Performance. EAV requires complex, multi-table joins to populate your object with the relevant attributes. That does incur a performance hit. However, that can be mitigated through careful caching (at all levels through the stack, including query caching) and the selective use of denormalization. Magento does allow administrators to select a denormalized model for categories and products where the number of SKUs warrants it (generally in the thousands). That in turn requires Observers that trigger re-indexing (always good!) and updates to the "flat" denormalized tables when product data changes. That can also be scheduled or manually triggered with a prompt to the administrator.

2. 3rd Party User Complexity If you ever plan to make this application available to other users, many will find EAV too complex and you'll end up dealing with a lot of bleating and uninformed abuse on the user forums (ref Magento!!).

3. Future extensibility and plugin architecture. There is no doubt that the EAV model really comes into it's own when extensibility is a factor. It is very simple to add new attributes into the model while minimizing the risk of breaking existing ORM and controller code.

4. Changes in datatype EAV does make it a little harder to alter attribute datatypes. If your initial design calls for a particular attribute datatype that changes in future (say int to varchar), it means that you will have to migrate all the records for that attribute to the corresponding table that matches the new datatype. Of course, purists would suggest that you get the design right first time, but reality does intrude sometimes!

5. Manual product imports One thing that EAV makes almost impossible is importing products (or other entities) into the database using SQL and/or phpMyAdmin-style CSV/XML. You'll need to write an Importer module that accepts the structured data and passes it through the application's Model layer to persist it to the database. That does add to your complexity.