How do I use Nagios to monitor a log file

Kenoyer130 picture Kenoyer130 · Mar 3, 2010 · Viewed 74.8k times · Source

We are using Nagios to monitor our network with great success. However, we have a syslog for critical application errors and while I set up check_log, it doesn't seem to work as well as monitering a device.

The issues are:

  • It only shows the last entry
  • There doesn't seem to be a way to acknowledge the critical error and return the monitor to a good state

Is nagios the wrong tool, or are we just not setting up the service monitering right?

Here are my entries

# log file
define command{
        command_name    check_log
        command_line    $USER1$/check_log -F /var/log/applications/appcrit.log -O /tmp/appcrit.log -q ?
}


# Define the log monitering service
define service{
        name                            logfile-check           ;
        use                             generic-service         ;
        check_period                    24x7                    ;
        max_check_attempts              1                       ;
        normal_check_interval           5                       ;
        retry_check_interval            1                       ;
        contact_groups                  admins                  ;
        notification_options            w,u,c,r                 ;
        notification_period             24x7                    ;
        register                        0                       ;
        }

define service{
        use                             logfile-check
        host_name                       localhost
        service_description             CritLogFile
        check_command                   check_log
}

Answer

Archie picture Archie · Jan 25, 2011

For monitoring logs with Nagios, typically the log checker will return a warning only for newly discovered error messages each time it is invoked (so it must retain some state in order to know to ignore them on subsequent runs). Therefore I usually set:

max_check_attempts              1
is_volatile                     1

This causes Nagios to send out the alert immeidately, but only once, and then go back to normal.

My favorite log checker is logwarn, but I'm biased because I wrote it myself after not finding any existing ones that I liked. The logwarn package includes a Nagios plugin.