Backing up, Deleting, Restoring Elasticsearch Indexes By Index Folder

JasonG · Apr 17, 2015 · Viewed 8.8k times

Most of the Elasticsearch documentation discusses working with indexes through the REST API. Is there any reason I can't simply move or delete index folders on disk?

Answer

Lee H · Apr 17, 2015

You can move data around on disk, but only up to a point:

If Elasticsearch is running, it is never a good idea to move or delete the index folders, because Elasticsearch will not know what happened to the data, and you will get all kinds of FileNotFoundExceptions in the logs as well as indices that are red until you manually delete them.

If Elasticsearch is not running, you can move index folders to another node (for instance, if you were decommissioning a node permanently and needed to get the data off). However, if you delete a folder, or move it to a place where Elasticsearch cannot see it when the service is restarted, then Elasticsearch will be unhappy. This is because Elasticsearch writes what is known as the cluster state to disk, and the indices are recorded in this cluster state. So if ES starts up and expects to find index "foo", but you have deleted the "foo" index directory, the index will stay in a red state until it is deleted through the REST API.
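To see which indices ended up red after a folder went missing, one option is to ask the cluster health endpoint for per-index status (`GET /_cluster/health?level=indices`). The following is a minimal sketch, not a definitive tool: the function names and the `http://localhost:9200` address are assumptions for illustration, and it uses only the Python standard library.

```python
import json
from urllib.request import urlopen


def red_indices(health):
    """Return the sorted names of indices reported as 'red'.

    `health` is the parsed JSON body of GET /_cluster/health?level=indices,
    which contains an "indices" object keyed by index name.
    """
    indices = health.get("indices", {})
    return sorted(
        name for name, info in indices.items() if info.get("status") == "red"
    )


def fetch_health(base_url="http://localhost:9200"):
    """Fetch per-index cluster health (base_url is an assumed local node)."""
    with urlopen(base_url + "/_cluster/health?level=indices") as resp:
        return json.load(resp)
```

Against a live node, `red_indices(fetch_health())` would list the indices that the cluster state still expects but cannot fully find on disk.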

Because of this, I would recommend using the REST API whenever possible instead of moving or deleting individual index folders on disk, since deleting a folder that ES expects to find an index in can leave it in an unhappy state.
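For reference, deleting an index through the REST API is a single `DELETE /<index>` request. Here is a small sketch in Python using only the standard library; the helper names and the `http://localhost:9200` address are assumptions, and the index name "foo" is just the example from above.

```python
from urllib.parse import quote
from urllib.request import Request, urlopen


def build_delete_request(index, base_url="http://localhost:9200"):
    """Build (but do not send) a DELETE request for one index."""
    # Percent-encode the name so it is always treated as a single path segment.
    url = base_url.rstrip("/") + "/" + quote(index, safe="")
    return Request(url, method="DELETE")


def delete_index(index, base_url="http://localhost:9200"):
    """Send the DELETE request and return the raw response body."""
    with urlopen(build_delete_request(index, base_url)) as resp:
        return resp.read()
```

Calling `delete_index("foo")` removes the index from the cluster state as well as from disk, which is the part that deleting the folder by hand skips.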

EDIT: I should mention that, from the perspective of Elasticsearch, it's safe to copy an index folder (for backups), because copying doesn't modify the contents of the folder. Sometimes people do this to perform backups outside of the snapshot & restore API.
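A folder-level copy like that can be scripted in a few lines. This is only a sketch under stated assumptions: the function name, the `label` parameter, and any paths you pass in are hypothetical, and where index folders live depends on your `path.data` setting and Elasticsearch version.

```python
import shutil
from pathlib import Path


def backup_index_folder(index_dir, backup_root, label):
    """Copy one index folder to backup_root/<folder name>-<label>.

    The source folder is only read, never modified. The copy fails with
    FileExistsError if the destination already exists, so an earlier
    backup is not overwritten by accident.
    """
    src = Path(index_dir)
    dest = Path(backup_root) / f"{src.name}-{label}"
    shutil.copytree(src, dest)  # creates missing parent directories
    return dest
```

The `label` could be a date string, for example, so that repeated backups of the same index land in separate directories.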