Faceted Search (solr) vs Good old filtering via PHP?

Adil picture Adil · Aug 31, 2011 · Viewed 14k times · Source

I am planning on setting up a filter system (refine your search) in my ecommerce stores. You can see an example here: http://www.bettymills.com/shop/product/find/Air+and+HVAC+Filters

Platforms such as PrestaShop, OpenCart and Magento have what's called a Layered Navigation.

My question is what is the difference between the Layered Navigation in platforms such as Magento or PrestaShop in comparison to using something like Solr or Lucene for faceted navigation.

Can a similar result be accomplished via just php and mysql?

A detailed explanation is much appreciated.

Answer

netcoder picture netcoder · Aug 31, 2011

Layered Navigation == Faceted Search.

They are the same thing, but Magento and al uses different wording, probably to be catchy. As far as I know, Magento supports both the Solr faceted search or the MySQL one. The main difference is the performance.

Performance is the main trade-off.

To do faceted search in MySQL requires you to join tables, while Solr indexes the document facets automatically for filtering. You can generally achieve fast response times using Solr (<100ms for a multi-facet search query) on average hardware. While MySQL will take longer for the same search, it can be optimized with indexes to achieve similar response times.

The downside to Solr is that it requires you to configure, secure and run yet another service on your server. It can also be pretty CPU and memory intensive depending on your configuration (Tomcat, jetty, etc.).

Faceted search in PHP/MySQL is possible, and not as hard as you'd think.

You need a specific database schema, but it's feasible. Here's a simple example:

product

+----+------------+
| id | name       |
+----+------------+
|  1 | blue paint |
|  2 | red paint  |
+----+------------+

classification

+----+----------+
| id | name     |
+----+----------+
|  1 | color    |
|  2 | material |
|  3 | dept     |
+----+----------+

product_classification

+------------+-------------------+-------+
| product_id | classification_id | value |
+------------+-------------------+-------+
|          1 |                 1 | blue  |
|          1 |                 2 | latex |
|          1 |                 3 | paint |
|          1 |                 3 | home  |
|          2 |                 1 | red   |
|          2 |                 2 | latex |
|          2 |                 3 | paint |
|          2 |                 3 | home  |
+------------+-------------------+-------+

So, say someones search for paint, you'd do something like:

SELECT p.* FROM product p WHERE name LIKE '%paint%';

This would return both entries from the product table.

Once your search has executed, you can fetch the associated facets (filters) of your result using a query like this one:

SELECT c.id, c.name, pc.value FROM product p
   LEFT JOIN product_classification pc ON pc.product_id = p.id
   LEFT JOIN classification c ON c.id = pc.classification_id
WHERE p.name LIKE '%paint%'
GROUP BY c.id, pc.value
ORDER BY c.id;

This'll give you something like:

+------+----------+-------+
| id   | name     | value |
+------+----------+-------+
|    1 | color    | blue  |
|    1 | color    | red   |
|    2 | material | latex |
|    3 | dept     | home  |
|    3 | dept     | paint |
+------+----------+-------+

So, in your result set, you know that there are products whose color are blue and red, that the only material it's made from is latex, and that it can be found in departments home and paint.

Once a user select a facet, just modify the original search query:

SELECT p.* FROM product p
   LEFT JOIN product_classification pc ON pc.product_id = p.id
WHERE 
   p.name LIKE '%paint%' AND (
      (pc.classification_id = 1 AND pc.value = 'blue') OR
      (pc.classification_id = 3 AND pc.value = 'home')
   )
GROUP BY p.id
HAVING COUNT(p.id) = 2;

So, here the user is searching for keyword paint, and includes two facets: facet blue for color, and home for department. This'll give you:

+----+------------+
| id | name       |
+----+------------+
|  1 | blue paint |
+----+------------+

So, in conclusion. Although it's available out-of-the-box in Solr, it's possible to implement it in SQL fairly easily.