Create a new index per day for Elasticsearch in Logstash configuration

elixir · Nov 19, 2015

I intend to set up an ELK stack where daily JSON inputs are stored in log files, one per date. Logstash should listen to these log files as its input and store each entry in Elasticsearch under an index corresponding to the date of the log entry.

My logstash-output.conf goes something like:

output {
  elasticsearch {
    host => "localhost"
    cluster => "elasticsearch_prod"
    index => "test"
  }
}

Thus, for now, all input to Logstash is stored at the Elasticsearch index test. What I want is that an entry arriving on, say, 2015.11.19, which gets written to the log file logstash-2015.11.19.log, correspondingly ends up in an index test-2015.11.19.

How should I edit my Logstash configuration file to enable this?

Answer

pandaadb · Nov 19, 2015

Posting this as an answer because a comment can't be formatted and it would look awful.

Your filename (I assume you use a file input) is stored in the path field, like so:

file {
  path => "/logs/**/*my_log_file*.log"
  type => "myType"
}

This field is accessible throughout your whole configuration, so you can parse your date out of the path with a regular expression. For example, using grok, you could do something like this (look out: pseudocode):

if [type] == "myType" {
   grok {
      match => {
          "path" => "%{MY_DATE_PATTERN:myTimeStampVar}"
      }
   }
}
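For the file names from the question (e.g. logstash-2015.11.19.log), a concrete version of that pseudocode might look like the following; MY_DATE_PATTERN above is a placeholder, and the named capture below is an assumption based on that naming scheme:

filter {
  if [type] == "myType" {
    grok {
      # capture e.g. "2015.11.19" out of .../logstash-2015.11.19.log
      match => {
        "path" => "logstash-(?<myTimeStampVar>%{YEAR}\.%{MONTHNUM}\.%{MONTHDAY})\.log"
      }
    }
  }
}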

With this, your date is now available in the "myTimeStampVar" field, and you can use it in your output:

elasticsearch {
  host => "127.0.0.1"
  cluster => "logstash"
  index => "events-%{myTimeStampVar}"
}

Having said all this, I am not quite sure why you need this. I think it is better to let ES do the job for you: it knows the timestamp of your log and will index it accordingly, so you have easy access to it. That said, the setup above should work for you. I used a very similar approach to parse out a client name and create sub-indices on a per-client basis, for example: myIndex-%{client}-%{+YYYY.MM.dd}
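As a sketch of that timestamp-based alternative (reusing the index name test from the question), the sprintf date syntax derives the index from each event's @timestamp:

output {
  elasticsearch {
    host => "localhost"
    cluster => "elasticsearch_prod"
    # %{+YYYY.MM.dd} is expanded from the event's @timestamp field
    index => "test-%{+YYYY.MM.dd}"
  }
}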

Hope this helps,

Artur

Edit: I did some digging, because I suspect you are worried your logs get put into the wrong index because they are parsed at the wrong time. If that is the case, the solution is not to parse the index out of the file name, but to parse the timestamp out of each log line.

I assume each of your log lines has a timestamp. Logstash will create an @timestamp field set to the current date, so it would not match the intended index. However, the correct way to solve this is to overwrite the @timestamp field with the timestamp from your log line (the parsed one). That way Logstash will derive the correct index and put the event there.
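A minimal sketch of that, assuming grok has already extracted the line's timestamp into a field called logTimestamp in yyyy-MM-dd HH:mm:ss format (both the field name and the format are assumptions here):

filter {
  date {
    # parse the assumed logTimestamp field and overwrite @timestamp with it
    match => [ "logTimestamp", "yyyy-MM-dd HH:mm:ss" ]
    target => "@timestamp"
  }
}

With that in place, an index like test-%{+YYYY.MM.dd} will reflect the time of the log line itself rather than the time Logstash ingested it.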