Sync PostgreSQL data with Elasticsearch

Khanetor · Mar 5, 2016 · Viewed 12.7k times

Ultimately I want a scalable search solution for the data in PostgreSQL. My research points toward using Logstash to ship write events from Postgres to Elasticsearch, but I have not found a usable solution. The solutions I have found use the jdbc input to query all data from Postgres on an interval, and delete events are not captured.

I think this is a common use case, so I hope you can share your experience or give me some pointers on how to proceed.

Answer

Val · Mar 6, 2016

If you also need to be notified of DELETEs and delete the corresponding document in Elasticsearch, it is true that the Logstash jdbc input will not help. You'd have to use a solution built around the binlog (the write-ahead log, in PostgreSQL terms), as suggested here.

However, if you still want to use the Logstash jdbc input, you could simply soft-delete records in PostgreSQL, i.e. add a BOOLEAN column that marks records as deleted. The same flag would then exist in Elasticsearch, and you can exclude flagged documents from your searches with a simple term query on the deleted field.
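As a rough sketch of the soft-delete idea, the snippet below flags a row instead of removing it and builds a search body that hides flagged documents. Everything named here is an assumption for illustration: the `articles` table, its `title` and `deleted` columns, and the helper functions are hypothetical, and `sqlite3` only stands in for PostgreSQL so the example runs without a server. The search body is a plain dict that would be sent to Elasticsearch's search endpoint.

```python
import sqlite3


def make_db():
    """Create a throwaway database with a soft-delete flag (hypothetical schema)."""
    # sqlite3 is a stand-in; in PostgreSQL the column would be declared as
    # BOOLEAN NOT NULL DEFAULT FALSE.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE articles ("
        " id INTEGER PRIMARY KEY,"
        " title TEXT NOT NULL,"
        " deleted INTEGER NOT NULL DEFAULT 0)"
    )
    conn.executemany(
        "INSERT INTO articles (id, title) VALUES (?, ?)",
        [(1, "first post"), (2, "second post")],
    )
    return conn


def soft_delete(conn, article_id):
    """Flag the row instead of removing it, so a polling sync still sees it."""
    conn.execute("UPDATE articles SET deleted = 1 WHERE id = ?", (article_id,))


def rows_to_sync(conn):
    """Read rows the way a polling job would, with the flag as a real boolean."""
    cur = conn.execute("SELECT id, title, deleted FROM articles ORDER BY id")
    return [{"id": i, "title": t, "deleted": bool(d)} for i, t, d in cur]


def search_body(text):
    """Build an Elasticsearch query body that excludes soft-deleted documents."""
    return {
        "query": {
            "bool": {
                "must": [{"match": {"title": text}}],
                "must_not": [{"term": {"deleted": True}}],
            }
        }
    }


conn = make_db()
soft_delete(conn, 2)
docs = rows_to_sync(conn)
```

For the flag to reach Elasticsearch, the polling query still has to pick up the updated row, for example by selecting on a last-modified timestamp.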

Whenever you need to perform some cleanup, you can delete all records flagged as deleted in both PostgreSQL and Elasticsearch.
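The cleanup step might look roughly like the sketch below. It assumes the same hypothetical `articles` table with a `deleted` flag, and again uses `sqlite3` as a stand-in for PostgreSQL. The second helper only builds a query body. In Elasticsearch 5.0 and later such a body can be posted to the `_delete_by_query` endpoint, while 2.x releases needed the delete-by-query plugin.

```python
import sqlite3


def purge_flagged_rows(conn):
    """Hard-delete rows that were soft-deleted earlier; returns how many went."""
    # In PostgreSQL the condition would read: WHERE deleted = TRUE
    cur = conn.execute("DELETE FROM articles WHERE deleted = 1")
    conn.commit()
    return cur.rowcount


def purge_flagged_docs_body():
    """Query body matching every document flagged as deleted."""
    return {"query": {"term": {"deleted": True}}}


# Throwaway data to exercise the purge (hypothetical schema and rows).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE articles ("
    " id INTEGER PRIMARY KEY,"
    " title TEXT NOT NULL,"
    " deleted INTEGER NOT NULL DEFAULT 0)"
)
conn.executemany(
    "INSERT INTO articles (id, title, deleted) VALUES (?, ?, ?)",
    [(1, "keep", 0), (2, "drop", 1), (3, "drop too", 1)],
)

removed = purge_flagged_rows(conn)
remaining = [row[0] for row in conn.execute("SELECT id FROM articles ORDER BY id")]
```

Because both stores select on the same flag, either side of the purge can be retried on its own if it fails.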