I'm getting this error, in my ES log I'm using three nodes.
Caused by: java.lang.ArrayIndexOutOfBoundsException
[2014-09-08 13:53:56,167][WARN ][cluster.action.shard ] [Dancing Destroyer] [events][3] sending failed shard for [events][3], node[RDZy21y7SRep7n6oWT8ogg], [P], s[INITIALIZING], indexUUID [gzj1aHTnQX6XDc0SxkvxDQ], reason [Failed to start shard, message [IndexShardGatewayRecoveryException[[events][3] failed recovery]; nested: FlushFailedEngineException[[events][3] Flush failed]; nested: ArrayIndexOutOfBoundsException; ]]
[2014-09-08 13:53:56,357][WARN ][indices.cluster ] [Dancing Destroyer] [events][3] failed to start shard
org.elasticsearch.index.gateway.IndexShardGatewayRecoveryException: [events][3] failed recovery
at org.elasticsearch.index.gateway.IndexShardGatewayService$1.run(IndexShardGatewayService.java:185)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)
Caused by: org.elasticsearch.index.engine.FlushFailedEngineException: [events][3] Flush failed
at org.elasticsearch.index.engine.internal.InternalEngine.flush(InternalEngine.java:805)
at org.elasticsearch.index.shard.service.InternalIndexShard.performRecoveryFinalization(InternalIndexShard.java:726)
at org.elasticsearch.index.gateway.local.LocalIndexShardGateway.recover(LocalIndexShardGateway.java:249)
at org.elasticsearch.index.gateway.IndexShardGatewayService$1.run(IndexShardGatewayService.java:132)
... 3 more
Caused by: java.lang.ArrayIndexOutOfBoundsException
[2014-09-08 13:53:56,381][WARN ][cluster.action.shard ] [Dancing Destroyer] [events][3] sending failed shard for [events][3], node[RDZy21y7SRep7n6oWT8ogg], [P], s[INITIALIZING], indexUUID [gzj1aHTnQX6XDc0SxkvxDQ], reason [Failed to start shard, message [IndexShardGatewayRecoveryException[[events][3] failed recovery]; nested: FlushFailedEngineException[[events][3] Flush failed]; nested: ArrayIndexOutOfBoundsException; ]]
This means that the status of ES is red and I'm missing nearly 10 million documents. What does this error mean, so that I'd might be able to recover?
It seems that I had a messed up shard, that needed fixing. It's a Lucene thing, where you tell Lucene to fix the shard.
For Ubuntu, the solution was to go to the /usr/share/elasticsearch/lib
directory and find out which Lucene core version is was running (running ls
will show you a file named something like lucene-core-4.8.1.jar) and then type:
java -cp lucene-core-x.x.x.jar -ea:org.apache.lucene... org.apache.lucene.index.CheckIndex /var/lib/elasticsearch/<clustername>/nodes/0/indices/<index>/<shard>/index/ -fix
Replace the x.x.x with the Lucene core version, with your clustername, index with the name of the index and of course with the failing shard number.
This can potentially give a loss of documents
But it fixed our issue.