I am using elasticsearch-py for Elasticsearch operations.
I am trying to use elasticsearch.helpers.bulk to create or update multiple records.
from elasticsearch import Elasticsearch
from elasticsearch import helpers
es = Elasticsearch()
data = [
    {
        "_index": "customer",
        "_type": "external",
        "_op_type": "create",
        "_id": 3,
        "doc": {"name": "test"}
    },
    {
        "_index": "customer",
        "_type": "external",
        "_op_type": "create",
        "_id": 4,
        "doc": {"name": "test"}
    },
    {
        "_index": "customer",
        "_type": "external",
        "_op_type": "create",
        "_id": 5,
        "doc": {"name": "test"}
    },
    {
        "_index": "customer",
        "_type": "external",
        "_op_type": "create",
        "_id": 6,
        "doc": {"name": "test"}
    },
]
print helpers.bulk(es, data)
Is there any way to perform this operation?
As far as I can tell, _op_type can only be set to create or update. If we use update and the record does not exist, it raises an error:
Traceback (most recent call last):
File "/tmp/test.py", line 37, in <module>
print helpers.bulk(es, data)
File "/local/lib/python2.7/site-packages/elasticsearch/helpers/__init__.py", line 182, in bulk
for ok, item in streaming_bulk(client, actions, **kwargs):
File "/local/lib/python2.7/site-packages/elasticsearch/helpers/__init__.py", line 155, in streaming_bulk
raise BulkIndexError('%i document(s) failed to index.' % len(errors), errors)
elasticsearch.helpers.BulkIndexError: ('4 document(s) failed to index.', [{u'update': {u'status': 404, u'_type': u'external', u'_id': u'3', u'error': u'DocumentMissingException[[customer][-1] [external][3]: document missing]', u'_index': u'customer'}}, {u'update': {u'status': 404, u'_type': u'external', u'_id': u'4', u'error': u'DocumentMissingException[[customer][-1] [external][4]: document missing]', u'_index': u'customer'}}, {u'update': {u'status': 404, u'_type': u'external', u'_id': u'5', u'error': u'DocumentMissingException[[customer][-1] [external][5]: document missing]', u'_index': u'customer'}}, {u'update': {u'status': 404, u'_type': u'external', u'_id': u'6', u'error': u'DocumentMissingException[[customer][-1] [external][6]: document missing]', u'_index': u'customer'}}])
According to the _bulk endpoint documentation, you can and should use the index action for this, provided your documents always keep the same identifiers. index creates the document if it does not exist yet and replaces it if it does.
create is useful when creating documents for the first time (it fails if the document already exists), and update is meant for partial and/or scripted updates.
You can also omit _op_type entirely; index is then used by default.
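As a minimal sketch of the index approach, the actions from the question could be built like this. The helper names build_index_actions and send are hypothetical and exist only for illustration. The sketch assumes the document body is passed under _source rather than doc (the doc wrapper belongs to the update action), and that _type is only needed on older Elasticsearch versions that still support mapping types.

```python
def build_index_actions(index, docs, doc_type=None):
    """Build one bulk action per (doc_id, body) pair (hypothetical helper)."""
    actions = []
    for doc_id, body in docs:
        action = {
            "_op_type": "index",  # may be omitted; index is the default
            "_index": index,
            "_id": doc_id,
            "_source": body,      # full document body, not a "doc" wrapper
        }
        if doc_type is not None:
            # Assumption: only relevant on older clusters with mapping types.
            action["_type"] = doc_type
        actions.append(action)
    return actions


def send(es, actions):
    """Send the actions through helpers.bulk (hypothetical wrapper).

    The import lives here so that building actions does not require
    the elasticsearch package or a running cluster.
    """
    from elasticsearch import helpers
    return helpers.bulk(es, actions)


actions = build_index_actions(
    "customer",
    [(3, {"name": "test"}), (4, {"name": "test"})],
    doc_type="external",
)
```

With a reachable cluster and a client such as es = Elasticsearch(), calling send(es, actions) should create documents 3 and 4 on the first run and overwrite them on later runs, without raising the document missing error.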