I'm using Elasticsearch (2.4) and the official Python client to perform simple queries. My code:
from elasticsearch import Elasticsearch

es_client = Elasticsearch("localhost:9200")
index = "indexName"
doc_type = "docType"

def search(query, search_size):
    body = {
        "fields": ["title"],
        "size": search_size,
        "query": {
            "query_string": {
                "fields": ["file.content"],
                "query": query
            }
        }
    }
    response = es_client.search(index=index, doc_type=doc_type, body=body)
    return response["hits"]["hits"]

search("python", 10)  # Works fine.
The problem arises when my query contains unbalanced parentheses or brackets, for example search("python {programming", 10).
ES throws:
elasticsearch.exceptions.RequestError: TransportError(400, u'search_phase_execution_exception', u'Failed to parse query [python {programming}]')
Is that the expected behavior of ES? Doesn't it use a tokenizer to remove all those characters?
Note: This happens to me using Java too.
I know I'm late, but I'm posting this in the hope that it helps others. As the Elasticsearch documentation for query_string explains, the query syntax has some reserved characters that act as operators, so they are parsed rather than stripped by the analyzer.
The reserved characters are: + - = && || > < ! ( ) { } [ ] ^ " ~ * ? : \ /
So there are two possible solutions. Both worked for me when I ran into the special character issue.
Solution 1: Escape each special character by prefixing it with a backslash (written as \\ because it sits inside a JSON string)
"query": {
  "bool": {
    "must": [
      {
        "match": {
          "country_code.keyword": "IT"
        }
      },
      {
        "query_string": {
          "default_field": "display",
          "query": "Magomadas \\(OR\\), Italy"
        }
      }
    ]
  }
}
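If the query text comes from users, the escaping in Solution 1 can be done programmatically before the request is built. The following is only a minimal sketch: escape_query_string is a hypothetical helper name, not part of the elasticsearch client, and it assumes you want every reserved character from the list above treated as literal text. As far as I know, < and > cannot be escaped in query_string at all, so this sketch simply removes them.

```python
import re

# Reserved query_string characters from the list above. "&&" and "||" are
# matched as pairs so that each pair gets a single leading backslash.
# ASSUMPTION: "<" and ">" are dropped rather than escaped.
_RESERVED = re.compile(r'(&&|\|\||[+\-=!(){}\[\]^"~*?:\\/])')


def escape_query_string(text):
    """Hypothetical helper: backslash-escape reserved query_string characters."""
    text = text.replace("<", "").replace(">", "")
    # r"\\\1" inserts one literal backslash in front of whatever matched.
    return _RESERVED.sub(r"\\\1", text)


# The failing query from the question, made safe for query_string.
print(escape_query_string("python {programming"))
print(escape_query_string("Magomadas (OR), Italy"))
```

Note that when the body is passed to the Python client as a dict, the client serializes it to JSON itself, so the Python string needs only one backslash per character; the doubled \\ appears only in the raw JSON shown above.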
Solution 2: Use simple_query_string and leave your query unchanged. It doesn't support default_field, so use fields instead.
"query": {
  "bool": {
    "must": [
      {
        "match": {
          "country_code.keyword": "IT"
        }
      },
      {
        "simple_query_string": {
          "fields": ["display"],
          "query": "Magomadas (OR), Italy"
        }
      }
    ]
  }
}
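Applied to the search function from the question, Solution 2 might look like the sketch below. build_simple_search_body is a hypothetical name introduced here for illustration; the field names are the ones from the question. It only builds the request body, so it can be inspected without a running cluster.

```python
import json


def build_simple_search_body(query, search_size):
    """Hypothetical helper: same body as in the question, but using
    simple_query_string, which discards invalid syntax instead of failing."""
    return {
        "fields": ["title"],
        "size": search_size,
        "query": {
            "simple_query_string": {
                "fields": ["file.content"],
                "query": query,  # passed through unchanged, no escaping
            }
        },
    }


# The query that previously raised RequestError is passed as-is.
body = build_simple_search_body("python {programming", 10)
print(json.dumps(body, indent=2))
```

The body would then be sent as before, e.g. es_client.search(index=index, doc_type=doc_type, body=body). Keep in mind that simple_query_string still interprets a smaller set of operators (such as +, | and -), so results can differ from a plain match query.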