Logstash File input: sincedb_path

John C · Apr 11, 2015

Upon restarting Logstash, I have at times observed that it duplicates log events. What is the right way to apply the start_position, sincedb_path, and sincedb_write_interval configuration options?

  • What happens when multiple files match the same path, as with /home/tom/testData/*.log in my example below?
  • What happens when a file is rotated, for example when XXX.log is renamed to XXX-<date>.log and a new XXX.log is created? In this case the name doesn't change, but the inode does.

I would highly appreciate it if anyone could shed some light on this.

input {
  file {
    path => "/home/tom/testData/*.log"
    type => "log"
    start_position => "beginning"
    sincedb_path => "/persistent/loc"
    sincedb_write_interval => 10
  }
}

Answer

Alain Collins · May 4, 2015

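start_position (beginning or end) is only used for files that Logstash has not seen before, that is, files with no entry in the sincedb registry. Once a file has an entry, Logstash resumes from the recorded position and ignores start_position. The only reason to use 'beginning' is when you're trying to load older files.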

sincedb_path is where Logstash writes the registry. It must be the path of a file, not a directory, and Logstash needs write permission there.

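sincedb_write_interval defines how often, in seconds, Logstash writes the sincedb registry (the default is 15). A larger value puts you at risk if Logstash were to crash: positions reached since the last write are not saved, so those events are read again on restart and show up as duplicates.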
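As a rough sketch, the three options might be combined as follows. This reuses the path from the question and assumes sincedb_path should name a file. The file name testData.sincedb is made up for illustration, and the interval is the question's value, not a recommendation.

```
input {
  file {
    path => "/home/tom/testData/*.log"
    type => "log"
    # Applies only to files with no sincedb entry yet; the default is "end".
    start_position => "beginning"
    # A file path, not a directory. The file name is a hypothetical example.
    sincedb_path => "/persistent/loc/testData.sincedb"
    # Seconds between registry writes. A smaller value narrows the window of
    # events that could be re-read after a crash.
    sincedb_write_interval => 10
  }
}
```

Whatever location is chosen, it should survive restarts and be writable by the user running Logstash. Otherwise the registry is lost and every matching file looks new again.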

When you have multiple files that match your glob, Logstash tracks them separately, with one entry per file in the registry.

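The registry identifies each file by inode number rather than by name, so Logstash knows what to do in that type of rotation: the renamed XXX-<date>.log keeps its inode and its recorded position, while the new XXX.log has a different inode and is treated as a new file.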