How to get Solr Suggester to return spelling suggestions as well

newbie · Jun 15, 2012

I'm currently integrating Apache Solr search into my platform and using the Suggester functionality for autocompletion. However, the Suggester module does not return spelling corrections. For example, if I search for:

shi

The Suggester module returns, among others, the following:

shirt
shirts

However, if I search for:

shrt

No suggestions are returned. What I'd like to know is:

a) Is this the result of an incorrect configuration of the Suggester module?
b) Is the Suggester module built in such a way that it does not return spelling suggestions?
c) How can I get the Suggester module to return spelling suggestions as well, without having to make a second request for spelling corrections?

I have read the Solr documentation but cannot seem to make any headway with this.

Answer

Nitin Tripathi · Jun 15, 2012

You need to configure a spell check component to generate alternate spelling options as described at https://lucene.apache.org/solr/guide/8_1/spell-checking.html

The task consists of the following steps:

First, update the schema.xml with a spellcheck field. This often means creating a new field and copying multiple fields to a single spellcheck field:

<field name="spellcheck" type="text_general" 
   indexed="true" 
   stored="false" 
   multiValued="true"/>

<copyField source="id" dest="spellcheck"/>
<copyField source="name" dest="spellcheck"/>
<copyField source="description" dest="spellcheck"/>
<copyField source="longdescription" dest="spellcheck"/>
<copyField source="category" dest="spellcheck"/>
<copyField source="source" dest="spellcheck"/>
<copyField source="merchant" dest="spellcheck"/>
<copyField source="contact" dest="spellcheck"/>

Next, in solrconfig.xml, define a solr.SpellCheckComponent and add it to your search request handler:

    <searchComponent name="spellcheck" class="solr.SpellCheckComponent">
      <lst name="spellchecker">
        <!-- Choose between a dictionary-based and an index-based spell checker.
        In most cases the index-based checker makes sense, because it only
        suggests terms that are actually present in your search corpus. -->
        <str name="classname">solr.IndexBasedSpellChecker</str>
        <!-- field to use -->
        <str name="field">spellcheck</str>
        <!-- buildOnCommit|buildOnOptimize -->
        <str name="buildOnCommit">true</str>
        <!-- $solr.solr.home/data/spellchecker-->
        <str name="spellcheckIndexDir">./spellchecker</str>
        <str name="accuracy">0.7</str>
        <float name="thresholdTokenFrequency">.0001</float>
      </lst>
    </searchComponent>

    <requestHandler name="/select" class="solr.SearchHandler">
      <lst name="defaults">
        <str name="echoParams">explicit</str>
        <int name="rows">10</int>
        <str name="df">defaultSearchField</str>
        <!-- spell check component configuration -->
        <str name="spellcheck">true</str>
        <str name="spellcheck.count">5</str>
        <str name="spellcheck.collate">true</str>
        <str name="spellcheck.maxCollationTries">5</str>
      </lst>
      <!-- Run spell checking after the default search components.
        The value below is the name of the searchComponent
        defined above. -->
      <arr name="last-components">
        <str>spellcheck</str>
      </arr>
    </requestHandler>

  • Reindex the corpus so that the spellcheck field is populated.

  • Test that suggestions are working. For example:

http://localhost:8983/solr/select/?q=coachin

{
  "responseHeader": {
    "status": 0,
    "QTime": 12,
    "params": {
      "indent": "true",
      "q": "coachin"
    }
  },
  "response": {
    "numFound": 0,
    "start": 0,
    "docs": []
  },
  "spellcheck": {
    "suggestions": [
      "coachin", {
        "numFound": 1,
        "startOffset": 0,
        "endOffset": 7,
        "suggestion": ["cochin"]
      }
    ]
  }
}
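
Note that the "suggestions" value in the response is a flat list that alternates between a misspelled term and an object describing its corrections. A client has to pair these up itself. The following is a minimal Python sketch of that, not a definitive client: the helper names build_spellcheck_url and parse_suggestions are hypothetical, the base URL is taken from the example request, and the wt=json and indent=true parameters are assumptions about how the JSON output above was requested.

```python
import json
from urllib.parse import urlencode

# Assumed base URL, taken from the example request above.
SOLR_SELECT_URL = "http://localhost:8983/solr/select"


def build_spellcheck_url(query, base_url=SOLR_SELECT_URL):
    """Return a select URL that asks for JSON output with spell checking on."""
    params = {
        "q": query,
        "wt": "json",          # assumption: request the JSON response writer
        "indent": "true",
        "spellcheck": "true",  # redundant if already set in the handler defaults
    }
    return base_url + "?" + urlencode(params)


def parse_suggestions(response):
    """Map each misspelled term to its list of suggested corrections.

    Assumes the flat layout shown above, where "suggestions" alternates
    between a term and a details object: [term, {...}, term, {...}].
    """
    flat = response.get("spellcheck", {}).get("suggestions", [])
    result = {}
    for term, details in zip(flat[0::2], flat[1::2]):
        # Skip pairs whose second element is not a details object.
        if isinstance(details, dict):
            result[term] = list(details.get("suggestion", []))
    return result


# The spellcheck part of the sample response shown above.
sample = json.loads("""
{
  "response": {"numFound": 0, "start": 0, "docs": []},
  "spellcheck": {
    "suggestions": [
      "coachin", {
        "numFound": 1,
        "startOffset": 0,
        "endOffset": 7,
        "suggestion": ["cochin"]
      }
    ]
  }
}
""")

print(build_spellcheck_url("coachin"))
print(parse_suggestions(sample))
```

With the sample response, parse_suggestions returns a dictionary that maps "coachin" to the single correction "cochin". In a real client you would fetch the URL and pass the decoded JSON body to the same function.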