SearchPhaseExecutionException[Failed to execute phase [query], all shards failed]

RayofCommand picture RayofCommand · May 6, 2015 · Viewed 62.9k times · Source

Recently our server was rebooted without correctly shutting down the Elastic Search / Kibana. After that reboot, both applications were running but no indices were getting created anymore. I checked logstash setup in debug mode and it is sending data to Elastic Search.

now all my created windows report this error:

Oops! SearchPhaseExecutionException[Failed to execute phase [query], all shards failed]

I tried restarting Elastic Search / Kibana, and cleared some indices. I searched a lot but wasn't able to troubleshoot this correctly.

Current Cluster Health Status is RED as shown in picture.

Elastic Search Cluster Health

Any help as of how to troubleshoot that is upvoted. Thank you

EDIT:

[2015-05-06 00:00:01,561][WARN ][cluster.action.shard     ] [Indech] [logstash-2015.03.16][1] sending failed shard for [logstash-2015.03.16][1], node[fdSgUPDbQB2B3NQqX7MdMQ], [P], s[INITIALIZING], indexUUID [aBcfbqnNR4-AGEdIR8dVdg], reason [Failed to start shard, message [IndexShardGatewayRecoveryException[[logstash-2015.03.16][1] failed to recover shard]; nested: ElasticsearchIllegalArgumentException[No version type match [101]]; ]]
[2015-05-06 00:00:01,561][WARN ][cluster.action.shard     ] [Indech] [logstash-2015.03.16][1] received shard failed for [logstash-2015.03.16][1], node[fdSgUPDbQB2B3NQqX7MdMQ], [P], s[INITIALIZING], indexUUID [aBcfbqnNR4-AGEdIR8dVdg], reason [Failed to start shard, message [IndexShardGatewayRecoveryException[[logstash-2015.03.16][1] failed to recover shard]; nested: ElasticsearchIllegalArgumentException[No version type match [101]]; ]]
[2015-05-06 00:00:02,591][WARN ][indices.cluster          ] [Indech] [logstash-2015.04.21][4] failed to start shard
org.elasticsearch.index.gateway.IndexShardGatewayRecoveryException: [logstash-2015.04.21][4] failed to recover shard
    at org.elasticsearch.index.gateway.local.LocalIndexShardGateway.recover(LocalIndexShardGateway.java:269)
    at org.elasticsearch.index.gateway.IndexShardGatewayService$1.run(IndexShardGatewayService.java:132)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:744)
Caused by: org.elasticsearch.ElasticsearchIllegalArgumentException: No version type match [52]
    at org.elasticsearch.index.VersionType.fromValue(VersionType.java:307)
    at org.elasticsearch.index.translog.Translog$Create.readFrom(Translog.java:364)
    at org.elasticsearch.index.translog.TranslogStreams.readTranslogOperation(TranslogStreams.java:52)
    at org.elasticsearch.index.gateway.local.LocalIndexShardGateway.recover(LocalIndexShardGateway.java:241)

what concerns me in the logsis this:

[2015-05-06 15:13:48,059][DEBUG][action.search.type       ] All shards failed for phase: [query]



{
  "cluster_name" : "elasticsearch",
  "status" : "red",
  "timed_out" : false,
  "number_of_nodes" : 8,
  "number_of_data_nodes" : 1,
  "active_primary_shards" : 120,
  "active_shards" : 120,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 310
}

Answer

Andrei Stefan picture Andrei Stefan · May 6, 2015

You have many corrupt translog files, which you need to delete. You can find it in data/{clustername}/nodes/0/indices/logstash-2015.04.21/4/translog and another one in data/{clustername}/nodes/0/indices/logstash-2015.03.16/1/translog. And maybe others, but this is what I can tell from the snippet you provided. Of course, will loose what is in the translog files.

If the indices don't have the index files anymore (only _state folder exists under data/{clustername}/nodes/0/indices/[index_name]) this means there is no data in that index anymore and at this point you can delete the index. You need to reindex that data, if you still need it. If you decide to delete the indices, you need to shutdown the node and delete the index folders under data/{clustername}/nodes/0/indices that are like the one you mentioned (empty, containing just the _state folder).