I'm wondering what everyone is using for logging, log management and log aggregation on their systems.
I work at a company that uses .NET for all its applications, and all systems are Windows-based. Currently each application looks after its own logging and failure notifications (e.g. if app A fails, it sends out its own 'call for help' to an admin).
While the current practice works, it's a bit hacky and hard to manage. I've been trying to find some options for making this work better and I've come up with the following:
Essentially, what we are after is something that can pull all the log entries together and allow some analytics to be run across them, plus an event-based system that can, for example, send out a warning email when there have been 30+ warning-level logs for an application in the last x minutes.
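To make that alerting rule concrete, here is a minimal sketch of the "30+ warnings in the last x minutes" check as a sliding window per application. It is in Python purely for illustration, not tied to any particular product; the class name `WarningRateRule`, the level string `"WARNING"`, and the 5-minute window are all assumptions of mine, and it assumes entries arrive in timestamp order.

```python
from collections import defaultdict, deque
from datetime import datetime, timedelta


class WarningRateRule:
    """Hypothetical rule: fire when an app logs `threshold` or more
    warnings within `window`. Assumes entries arrive in time order."""

    def __init__(self, threshold=30, window=timedelta(minutes=5)):
        self.threshold = threshold
        self.window = window
        self._events = defaultdict(deque)  # app name -> warning timestamps

    def record(self, app, level, timestamp):
        """Record one log entry; return True if the rule fires for `app`."""
        if level != "WARNING":
            return False
        events = self._events[app]
        events.append(timestamp)
        # Drop timestamps that have fallen out of the window.
        cutoff = timestamp - self.window
        while events and events[0] <= cutoff:
            events.popleft()
        if len(events) >= self.threshold:
            events.clear()  # reset so one burst produces one alert
            return True
        return False


# Example: 30 warnings from one app, one second apart.
rule = WarningRateRule(threshold=30, window=timedelta(minutes=5))
start = datetime(2024, 1, 1, 12, 0, 0)
fired = [
    rule.record("AppA", "WARNING", start + timedelta(seconds=i))
    for i in range(30)
]
```

Only the last call in the example returns True, which is the point where an aggregator would send the warning email. Clearing the window after firing is a design choice to avoid repeating the alert on every subsequent warning; a real system might prefer a cooldown period instead.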
So is there anything I've missed, or something someone else can suggest?
If you can, I'd recommend writing to the EventLog and creating rules in SCOM to monitor it. We use this extensively and it works well. We have even put together pieces of code that monitor certain elements of our apps and write values to the event log; SCOM parses the log for errors and graphs those, plus informational entries, into reports showing stats over a given time.
I am, however, quite keen on rewriting some of that into WMI and having SCOM poll the WMI service for those same counters, as writing queue lengths to the event log every 15 minutes seems a little wasteful ;)