Validation Failed: 1: no requests added in bulk indexing ElasticSearch

rick · Apr 21, 2016 · Viewed 23k times

I have a JSON file that I need to index on an Elasticsearch server.

The JSON file looks like this:

{
    "sku": "1",
    "vbid": "1",
    "created": "Sun, 05 Oct 2014 03:35:58 +0000",
    "updated": "Sun, 06 Mar 2016 12:44:48 +0000",
    "type": "Single",
    "downloadable-duration": "perpetual",
    "online-duration": "365 days",
    "book-format": "ePub",
    "build-status": "In Inventory",
    "description": "On 7 August 1914, a week before the Battle of Tannenburg and two weeks before the Battle of the Marne, the French army attacked the Germans at Mulhouse in Alsace. Their objective was to recapture territory which had been lost after the Franco-Prussian War of 1870-71, which made it a matter of pride for the French. However, after initial success in capturing Mulhouse, the Germans were able to reinforce more quickly, and drove them back within three days. After forty-three years of peace, this was the first test of strength between France and Germany. In 1929 Karl Deuringer wrote the official history of the battle for the Bavarian Army, an immensely detailed work of 890 pages; First World War expert and former army officer Terence Zuber has translated this study and edited it down to more accessible length, to produce the first account in English of the first major battle of the First World War.",
    "publication-date": "07/2014",
    "author": "Deuringer, Karl",
    "title": "The First Battle of the First World War: Alsace-Lorraine",
    "sort-title": "First Battle of the First World War: Alsace-Lorraine",
    "edition": "0",
    "sampleable": "false",
    "page-count": "0",
    "print-drm-text": "This title will only allow printing of 2 consecutive pages at a time.",
    "copy-drm-text": "This title will only allow copying of 2 consecutive pages at a time.",
    "kind": "book",
    "fro": "false",
    "distributable": "true",
    "subjects": {
      "subject": [
        {
          "-schema": "bisac",
          "-code": "HIS027090",
          "#text": "World War I"
        },
        {
          "-schema": "coursesmart",
          "-code": "cs.soc_sci.hist.milit_hist",
          "#text": "Social Sciences -> History -> Military History"
        }
      ]
    },
    "pricelist": {
      "publisher-list-price": "0.0",
      "digital-list-price": "7.28"
    },
    "publisher": {
      "publisher-name": "The History Press",
      "imprint-name": "The History Press Ireland"
    },
    "aliases": {
      "eisbn-canonical": "1",
      "isbn-canonical": "1",
      "print-isbn-canonical": "9780752460864",
      "isbn13": "1",
      "isbn10": "0750951796",
      "additional-isbns": {
        "isbn": [
          {
            "-type": "print-isbn-10",
            "#text": "0752460862"
          },
          {
            "-type": "print-isbn-13",
            "#text": "97807524608"
          }
        ]
      }
    },
    "owner": {
      "company": {
        "id": "1893",
        "name": "The History Press"
      }
    },
    "distributor": {
      "company": {
        "id": "3658",
        "name": "asc"
      }
    }
  }

But when I try to index this JSON file using the command

curl -XPOST 'http://localhost:9200/_bulk' -d @1.json

I get this error:

{"error":{"root_cause":[{"type":"action_request_validation_exception","reason":"Validation Failed: 1: no requests added;"}],"type":"action_request_validation_exception","reason":"Validation Failed: 1: no requests added;"},"status":400}

I don't know where I am making a mistake.

Answer

davide · Apr 21, 2016

The Elasticsearch bulk API uses a special syntax made up of JSON documents, each written on a single line. Take a look at the documentation. Your file is a single pretty-printed document spread over many lines with no action line, which is why Elasticsearch reports that no requests were added.

The syntax is pretty simple. Indexing, creating, and updating each need two single-line JSON documents: the first line specifies the action, and the second gives the document to index, create, or update. Deleting a document needs only the action line. For example (from the documentation):

{ "index" : { "_index" : "test", "_type" : "type1", "_id" : "1" } }
{ "field1" : "value1" }
{ "create" : { "_index" : "test", "_type" : "type1", "_id" : "3" } }
{ "field1" : "value3" }
{ "update" : {"_id" : "1", "_type" : "type1", "_index" : "index1"} }   
{ "doc" : {"field2" : "value2"} }
{ "delete" : { "_index" : "test", "_type" : "type1", "_id" : "2" } }
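For the file in the question, the conversion to this format could be sketched in Python as follows. This is only a minimal sketch: the function names `build_bulk_body` and `convert_file`, the index name `books`, the type name `book`, and the choice of `sku` as the document ID are all assumptions made for illustration, not anything the question or Elasticsearch prescribes.

```python
import json


def build_bulk_body(docs, index, doc_type, id_field=None):
    """Build a bulk body: an action line followed by a source line per document."""
    lines = []
    for doc in docs:
        meta = {"_index": index, "_type": doc_type}
        if id_field is not None:
            meta["_id"] = doc[id_field]  # reuse a field of the document as its ID
        # json.dumps emits no raw newlines by default, so each call yields one line.
        lines.append(json.dumps({"index": meta}))
        lines.append(json.dumps(doc))
    # The bulk body must end with a newline.
    return "\n".join(lines) + "\n"


def convert_file(src_path, dst_path, index, doc_type, id_field=None):
    """Read one pretty-printed JSON document and write it out in bulk format."""
    with open(src_path, encoding="utf-8") as src:
        doc = json.load(src)
    with open(dst_path, "w", encoding="utf-8") as dst:
        dst.write(build_bulk_body([doc], index, doc_type, id_field))
```

Calling `convert_file("1.json", "requests", index="books", doc_type="book", id_field="sku")` would write a two-line file named `requests`, which can then be posted to the `_bulk` endpoint.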

Don't forget to end your file with a newline. Then, to call the bulk API, use the command:

curl -s -XPOST localhost:9200/_bulk --data-binary "@requests"

From the documentation:

If you’re providing text file input to curl, you must use the --data-binary flag instead of plain -d
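If you would rather send the file from a script than from curl, something along these lines should work with Python's standard library. Treat it as a sketch under assumptions: `make_bulk_request` is a hypothetical helper name, the URL is the local default used in the question, and the `Content-Type: application/x-ndjson` header is included because later Elasticsearch versions require a content type on bulk requests (the 2016-era server in the question did not). Reading the file in binary mode keeps the newlines intact, which is the same thing `--data-binary` does for curl.

```python
import urllib.request


def make_bulk_request(path, url="http://localhost:9200/_bulk"):
    """Prepare (but do not send) a POST request carrying a bulk file unchanged."""
    # Binary mode: the bytes, including every newline, are passed through as-is.
    with open(path, "rb") as f:
        body = f.read()
    # The bulk body has to end with a newline; add one if the file lacks it.
    if not body.endswith(b"\n"):
        body += b"\n"
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/x-ndjson"},
        method="POST",
    )


# Sending it requires a running server, so this part is left commented out:
# with urllib.request.urlopen(make_bulk_request("requests")) as resp:
#     print(resp.read().decode("utf-8"))
```

The response body is JSON with one entry per action, so it is worth checking it for per-item errors even when the HTTP status is 200.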