Index CSV to ElasticSearch in Python

bluesummers · Jan 10, 2017 · Viewed 16.5k times

I'm looking to index a CSV file into Elasticsearch without using Logstash. I am using the high-level elasticsearch-dsl library.

Given a CSV file with a header row, for example:

name,address,url
adam,hills 32,http://rockit.com
jane,valleys 23,http://popit.com

What is the best way to index all the data by field? Ultimately, I want each row to end up as a document like this:

{
    "name": "adam",
    "address": "hills 32",
    "url": "http://rockit.com"
}

Answer

Honza Král · Jan 11, 2017

This kind of task is easier with the lower-level elasticsearch-py library:

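import csv

from elasticsearch import Elasticsearch, helpers

# Client instance; pass connection details here if the defaults do not fit
es = Elasticsearch()

# newline='' lets the csv module handle line endings itself
with open('/tmp/x.csv', newline='') as f:
    # DictReader uses the header row as keys and yields one dict per row
    reader = csv.DictReader(f)
    # bulk() indexes each dict as a document; doc_type is only for older
    # Elasticsearch versions, since mapping types were removed in 8.x
    helpers.bulk(es, reader, index='my-index', doc_type='my-type')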
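
To check that each row comes out in the shape the question asks for before anything is sent to the cluster, a minimal sketch along these lines may help. It uses only the standard library. The function name `csv_to_actions` and the in-memory sample are hypothetical and used purely for illustration; they are not part of elasticsearch-py. Each yielded dict follows the `_index` / `_source` action format that `helpers.bulk` accepts.

```python
import csv
import io


def csv_to_actions(lines, index_name):
    """Yield one bulk action per CSV row, keyed by the header fields.

    Hypothetical helper for illustration, not an elasticsearch-py function.
    """
    for row in csv.DictReader(lines):
        # Each row is a dict such as {"name": "adam", "address": "hills 32", ...}
        yield {"_index": index_name, "_source": dict(row)}


# In-memory stand-in for the CSV file from the question
sample = io.StringIO(
    "name,address,url\n"
    "adam,hills 32,http://rockit.com\n"
    "jane,valleys 23,http://popit.com\n"
)

actions = list(csv_to_actions(sample, "my-index"))
print(actions[0]["_source"])
```

With a real file, the generator could be passed straight to the bulk helper, for example `helpers.bulk(es, csv_to_actions(f, 'my-index'))`. Spelling out the action dicts is optional here, but it leaves room to set fields such as `_id` per row later.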