What is the difference between DisMax and EDisMax?

gangatharan · Nov 28, 2012 · Viewed 20.2k times

I would like to know the difference between DisMax and EDisMax. Is there a useful reference on the subject? I would also like to know which kinds of queries fail to return results with DisMax but do return results with EDisMax.

EDisMax has some extra query parameters, such as boost, ps, and pf2. But apart from these parameters, how is EDisMax better than DisMax? How do the two parsers process queries differently, and what factors make EDisMax do better than DisMax?

Some queries return no results with DisMax, but EDisMax returns results for the same queries.

I googled the difference between DisMax and EDisMax. From what I found, the only difference is the set of parameters that EDisMax supports, but I am looking for a more technical explanation that I can present to others.

http://ip:8983/solr/C73/select/?defType=edismax&q=ipod OR video&fl=filename, score&hl=true&hl.fl=content contentenstem filename&hl.zetaContentField=content

For the query above, EDisMax returns about 238 results, but DisMax returns 0. So how do the two parsers differ in handling this query, and what makes EDisMax return results? That is what I would like to know.

Answer

Jayendra · Nov 28, 2012

Because DisMax had a lot of limitations, the EDisMax query parser was added.

Check out SOLR-1553

To start with, as the documentation puts it:

The extended dismax parser was based on the original Solr dismax parser.

  • Supports full Lucene query syntax in the absence of syntax errors
  • Supports "and"/"or" to mean "AND"/"OR" in Lucene syntax mode
  • When there are syntax errors, improved smart partial escaping of special characters is done to prevent them... in this mode, fielded queries, +/-, and phrase queries are still supported.
  • Improved proximity boosting via word bigrams... this prevents the problem of needing 100% of the words in the document to get any boost, as well as having all of the words in a single field.
  • Advanced stopword handling... stopwords are not required in the mandatory part of the query but are still used (if indexed) in the proximity boosting part. If a query consists of all stopwords (e.g. to be or not to be) then all will be required.
  • Supports the "boost" parameter... like the dismax bf param, but multiplies the function query instead of adding it in
  • Supports pure negative nested queries... so a query like +foo (-foo) will match all documents
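
Two of the points above (word bigrams and multiplicative boosting) can be illustrated with a small Python sketch. This is only a toy model of the ideas, not Solr's implementation: the function names are hypothetical, and real Solr scoring involves normalization and per-field weighting that are left out here.

```python
def word_bigrams(query):
    """Return adjacent word pairs, i.e. the two-word phrases that a
    bigram-based proximity boost would look for in a document."""
    words = query.split()
    return [" ".join(pair) for pair in zip(words, words[1:])]


def additive_score(base_score, function_value):
    # bf-style (assumed simplification): the function value is added
    # to the relevance score.
    return base_score + function_value


def multiplicative_score(base_score, function_value):
    # boost-style (assumed simplification): the relevance score is
    # multiplied by the function value.
    return base_score * function_value


print(word_bigrams("apple ipod video"))  # ['apple ipod', 'ipod video']
print(additive_score(2.0, 0.5))          # 2.5
print(multiplicative_score(2.0, 0.5))    # 1.0
```

With bigrams, a document containing only "ipod video" can still earn a proximity boost, even though it does not contain the full three-word phrase.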

Beyond that, you will find a lot of associated JIRA issues that improve the query parsing capability and add support for more features.

Reading through the JIRA issues can be really insightful :)
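
To compare how each parser handles the query from the question, one option is to send the same request with both defType values and look at the parsed query in the debug output. The Python sketch below only builds the two URLs. The host `localhost` is a placeholder, the core name C73 is taken from the question, and `debugQuery=true` is a standard Solr request parameter that was not part of the original request. Since DisMax does not offer the "OR" operator support listed above, it most likely treats OR as an ordinary search term, which would explain the 0 results.

```python
from urllib.parse import urlencode

# Placeholder host; the core name C73 comes from the question.
BASE_URL = "http://localhost:8983/solr/C73/select"


def build_query_url(def_type, query):
    """Build a select URL for the given query parser (dismax or edismax)."""
    params = {
        "defType": def_type,
        "q": query,
        "fl": "filename,score",
        # Added for illustration: asks Solr to include the parsed query
        # in the response so the two parsers can be compared.
        "debugQuery": "true",
    }
    return BASE_URL + "?" + urlencode(params)


for parser in ("dismax", "edismax"):
    print(build_query_url(parser, "ipod OR video"))
```

The two URLs differ only in the defType value, so any difference in the parsed query comes from the parser itself.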