ElasticSearch searching with hyphen inside a word

Sensini · Jul 8, 2015 · Viewed 14.2k times

I would like to ask for help. I want to search for words inside the Title and Content fields. Here is the mapping:

'body' => array(
  'mappings' => array(
    'myindex' => array(
      '_source' => array(
        'enabled' => true
      ),
      'properties' => array(
        'Title' => array(
          'type'   => 'string',
          'fields' => array(
            'raw' => array(
              'type'  => 'string',
              'index' => 'not_analyzed'
            )
          )
        ),
        'Content' => array(
          'type' => 'string'
        ),
        'Image' => array(
          'type'     => 'string',
          'analyzer' => 'standard'
        )
      )
    )
  )
)

The query string looks like this, where I want to search for "15-g" inside a text like "15-game":

"query" : {
  "query_string": {
    "query": "*15-g*",
    "fields": [ "Title", "Content" ]
  }
}

Please accept my apologies if this duplicates another question, but I cannot figure out what is going on and why the query does not return any results.

I've already had a look at:

ElasticSearch - Searching with hyphens

ElasticSearch - Searching with hyphens in name

But I can't get any of those to work for me.

What is really interesting is that if I search for "15 - g" (15space-spaceg) it returns the result.

Thank you so much in advance!

Answer

Andrei Stefan · Jul 9, 2015

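Add a .raw field to your Content as well and run the search against the .raw fields. The analyzed fields go through the standard analyzer, which splits "15-game" into the separate tokens "15" and "game", so no single indexed term contains "15-g". The not_analyzed .raw fields keep the whole value as one term: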

{
  "query": {
    "query_string": {
      "query": "*15-g*",
      "fields": [
        "Title.raw",
        "Content.raw"
      ]
    }
  }
}
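
For reference, the Content entry in the question's mapping could be extended along these lines. This is only a sketch that mirrors what Title already has; it assumes the same index layout as in the question, and documents that are already indexed would most likely need to be reindexed before Content.raw returns anything.

```php
// Sketch: the 'Content' entry from the question's mapping, extended with a
// not_analyzed 'raw' sub-field in the same way 'Title' already is.
'Content' => array(
  'type'   => 'string',
  'fields' => array(
    'raw' => array(
      'type'  => 'string',
      'index' => 'not_analyzed'
    )
  )
),
```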

Anywhere the text you are searching for contains a space that should match the field value, the space must be escaped with a backslash (written as \\ inside JSON). Also, whenever the search string contains upper-case letters together with wildcards and you want it to match the .raw fields as written, set lowercase_expanded_terms to false. That setting defaults to true, which lowercases the search string (the query below would otherwise search for laptop - black):

{
  "query": {
    "query_string": {
      "query": "*Laptop\\ -\\ Black*",
      "lowercase_expanded_terms": false, 
      "fields": [
        "Title.raw",
        "Content.raw"
      ]
    }
  }
}
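
Since the question builds its request as PHP arrays, the same query could be assembled with a small helper like the one below. This is a minimal sketch: buildRawWildcardQuery is a hypothetical function name, not part of any Elasticsearch client, and it only escapes spaces, as described above. Other reserved query_string characters are not handled.

```php
<?php
// Hypothetical helper (an assumption for illustration, not a library API):
// builds the query_string body shown above for an arbitrary search term.
function buildRawWildcardQuery($term)
{
    // Escape each space with a backslash so the query parser treats the
    // whole phrase as one term instead of splitting it on whitespace.
    $escaped = str_replace(' ', '\\ ', $term);

    return array(
        'query' => array(
            'query_string' => array(
                'query' => '*' . $escaped . '*',
                // Keep the case as typed; the .raw fields are not lowercased.
                'lowercase_expanded_terms' => false,
                'fields' => array('Title.raw', 'Content.raw'),
            ),
        ),
    );
}

// json_encode doubles the backslashes, giving "*Laptop\\ -\\ Black*" in the JSON.
echo json_encode(buildRawWildcardQuery('Laptop - Black'));
```

The returned array can be passed as the 'body' of a search request in the same way the mapping array in the question is passed when creating the index.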