Facet dynamic fields with Apache Solr

krinn · Sep 22, 2011

I have defined a dynamic field in Apache Solr, which I use to store product features such as color_feature, diameter_feature, material_feature and so on. The number of these fields is not constant because the products keep changing.

Is it possible to get facet results for all of those dynamic fields with a single query, or do I always need to list every field in the query, like ... facet.field=color_feature&facet.field=diameter_feature&facet.field=material_feature&facet.field=...

Answer

Nicholas Piasecki · Jan 25, 2013

I was in a similar situation when working on an e-commerce platform. Each item had static fields (Price, Name, Category) that easily mapped to SOLR's schema.xml, but each item could also have a variable number of variations.

For example, a t-shirt in the store could have Color (Black, White, Red, etc.) and Size (Small, Medium, etc.) attributes, whereas a candle in the same store could have a Scent (Pumpkin, Vanilla, etc.) variation. Essentially, this is an entity-attribute-value (EAV) relational database design used to describe some features of the product.

Since the schema.xml file in SOLR is flat from the perspective of faceting, I worked around it by munging the variations into a single multi-valued field ...

<field
  name="variation"
  type="string"
  indexed="true"
  stored="true"
  required="false"
  multiValued="true" />

... shoving data from the database into this field as values like Color|Black, Size|Small, and Scent|Pumpkin ...

  <doc>
    <field name="id">ITEM-J-WHITE-M</field>
    <field name="itemgroup.identity">2</field>
    <field name="name">Original Jock</field>
    <field name="type">ITEM</field>
    <field name="variation">Color|White</field>
    <field name="variation">Size|Medium</field>
  </doc>
  <doc>
    <field name="id">ITEM-J-WHITE-L</field>
    <field name="itemgroup.identity">2</field>
    <field name="name">Original Jock</field>
    <field name="type">ITEM</field>
    <field name="variation">Color|White</field>
    <field name="variation">Size|Large</field>
  </doc>
  <doc>
    <field name="id">ITEM-J-WHITE-XL</field>
    <field name="itemgroup.identity">2</field>
    <field name="name">Original Jock</field>
    <field name="type">ITEM</field>
    <field name="variation">Color|White</field>
    <field name="variation">Size|Extra Large</field>
  </doc>

... so that when I tell SOLR to facet on that field, I get results that look like ...

<lst name="facet_counts">
  <lst name="facet_queries"/>
  <lst name="facet_fields">
    <lst name="variation">
      <int name="Color|White">2</int>
      <int name="Size|Extra Large">2</int>
      <int name="Size|Large">2</int>
      <int name="Size|Medium">2</int>
      <int name="Size|Small">2</int>
      <int name="Color|Black">1</int>
    </lst>
  </lst>
  <lst name="facet_dates"/>
  <lst name="facet_ranges"/>
</lst>

... so that my code that parses these results to display to the user can just split on my | delimiter (assuming that neither my keys nor values will have a | in them) and then group by the keys ...

Color
    White (2)
    Black (1)
Size
    Extra Large (2)
    Large (2)
    Medium (2)
    Small (2)

... which is good enough for government work.
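For what it's worth, the split-and-group step could look something like the Python sketch below. This is not my production code, just an illustration: the names encode_variation, group_variation_facets and SAMPLE_FACET_XML are made up for this example, and it assumes the XML response format shown above and that neither keys nor values contain the | delimiter.

```python
import xml.etree.ElementTree as ET

DELIMITER = "|"  # assumed never to appear inside a key or a value

# Trimmed copy of the facet response shown above (hypothetical constant).
SAMPLE_FACET_XML = """
<lst name="facet_counts">
  <lst name="facet_queries"/>
  <lst name="facet_fields">
    <lst name="variation">
      <int name="Color|White">2</int>
      <int name="Size|Extra Large">2</int>
      <int name="Size|Large">2</int>
      <int name="Size|Medium">2</int>
      <int name="Size|Small">2</int>
      <int name="Color|Black">1</int>
    </lst>
  </lst>
</lst>
"""


def encode_variation(key, value):
    """Build the 'Key|Value' string stored in the multi-valued field."""
    if DELIMITER in key or DELIMITER in value:
        raise ValueError("keys and values must not contain the delimiter")
    return f"{key}{DELIMITER}{value}"


def group_variation_facets(facet_xml, field="variation"):
    """Split each facet value on the delimiter and group counts by key.

    Returns a dict mapping key -> list of (value, count) tuples, in the
    order Solr returned them.
    """
    root = ET.fromstring(facet_xml)
    path = f".//lst[@name='facet_fields']/lst[@name='{field}']/int"
    grouped = {}
    for node in root.findall(path):
        key, _, value = node.get("name").partition(DELIMITER)
        grouped.setdefault(key, []).append((value, int(node.text)))
    return grouped


if __name__ == "__main__":
    for key, values in group_variation_facets(SAMPLE_FACET_XML).items():
        print(key)
        for value, count in values:
            print(f"    {value} ({count})")
```

Run against the sample response, this prints the same Color/Size grouping as the listing above. Validating the delimiter at indexing time (encode_variation) is what makes the simple split on the way back out safe.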

One disadvantage of doing it this way is that you lose the ability to do range facets on this EAV data. In my case that didn't matter: the Price field applies to all items, so it is defined in schema.xml and can be faceted in the usual way.

Hope this helps someone!