bash tail on a live log file, counting uniq lines with same date/time

zapp · Dec 12, 2013 · Viewed 13.8k times

I'm looking for a good way to tail a live log file and display the number of lines that share the same date/time.

Currently this is working:

 tail -F /var/logs/request.log | [cut the date-time] | uniq -c

But the performance is not good enough: there is a delay of more than a minute, and the output arrives in batches of a few lines at a time.

Any idea?

Answer

Floris · Dec 13, 2013

Your problem is most likely related to buffering in your system, not anything intrinsically wrong with your line of code. I was able to create a test scenario where I could reproduce it - then make it go away. I hope it will work for you too.

Here is my test scenario. First I write a short script that writes the time to a file every 100 ms (approx) - this is my "log file" that generates enough data that uniq -c should give me an interesting output every second:

#!/bin/ksh
while :
do
  echo The time is `date` >> a.txt
  sleep 0.1
done

(Note - I had to use ksh which has the ability to do a sub-second sleep)
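For readers without ksh, a bounded variant of the generator can be sketched in plain sh. This is only a sketch: it assumes the system's sleep accepts fractional seconds (the GNU and BSD versions do, but POSIX does not require it), and write_ticks with its two arguments is a hypothetical helper, not part of the script above.

```shell
#!/bin/sh
# write_ticks FILE COUNT: append COUNT timestamped lines to FILE,
# pausing roughly 100 ms between writes.
# Assumption: sleep accepts a fractional argument such as 0.1.
write_ticks() {
  file=$1
  count=$2
  i=0
  while [ "$i" -lt "$count" ]; do
    # $(date) is left unquoted on purpose: word splitting collapses the
    # double space that date uses to pad single-digit days, so the time
    # stays in field 7 for a later cut -f7 -d' '.
    echo The time is $(date) >> "$file"
    sleep 0.1
    i=$((i + 1))
  done
}

# Example (commented out so that sourcing this file has no side effects):
# write_ticks a.txt 600    # roughly one minute of "log" lines
```

Bounding the loop makes the generator easy to stop and rerun while experimenting; the original infinite loop behaves the same way until interrupted.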

In another window, I type

tail -f a.txt | uniq -c

Sure enough, you get the following output appearing every second:

   9 The time is Thu Dec 12 21:01:05 EST 2013
  10 The time is Thu Dec 12 21:01:06 EST 2013
  10 The time is Thu Dec 12 21:01:07 EST 2013
   9 The time is Thu Dec 12 21:01:08 EST 2013
  10 The time is Thu Dec 12 21:01:09 EST 2013
   9 The time is Thu Dec 12 21:01:10 EST 2013
  10 The time is Thu Dec 12 21:01:11 EST 2013
  10 The time is Thu Dec 12 21:01:12 EST 2013

And so on, with no delays. Note that at this point I had not attempted to cut out the time. Next, I ran:

tail -f a.txt | cut -f7 -d' ' | uniq -c

And your problem reproduced - it would "hang" for quite a while (until there was 4k of characters in the buffer, and then it would vomit it all out at once).
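The 4k burst is consistent with how tools built on C stdio usually behave: output is line-buffered when standard output is a terminal and block-buffered when it is a pipe or a file. The sketch below only illustrates that terminal check from the shell using [ -t 1 ]; stdout_kind is a hypothetical helper, and cut makes the equivalent decision internally rather than through anything like this function.

```shell
#!/bin/sh
# stdout_kind: report whether standard output is attached to a terminal.
stdout_kind() {
  if [ -t 1 ]; then
    echo "terminal"
  else
    echo "not a terminal"
  fi
}

# Run directly in an interactive shell, stdout is the terminal:
#   stdout_kind          # typically prints: terminal
# In a pipeline, stdout is a pipe, as it is for cut in "tail | cut | uniq":
#   stdout_kind | cat    # prints: not a terminal
```

In the first pipeline, uniq writes to the terminal, so its output appears promptly. In the second, cut writes to a pipe, which is where the lines pile up.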

A bit of searching online (https://stackoverflow.com/a/16823549/1967396) told me of a utility called stdbuf. That reference describes almost exactly your scenario and provides the following workaround (paraphrased to match my scenario above):

tail -f a.txt | stdbuf -oL cut -f7 -d' ' | uniq -c

And that would be great… except that this utility doesn't exist on my machine (Mac OS) - it is specific to GNU coreutils. This left me unable to test - although it may be a good solution for you.

Never fear - I found the following workaround, based on the socat command (which I honestly barely understand, but I adapted from the answer given at https://unix.stackexchange.com/a/25377).

Make a small file called tailcut.sh (this is the "long_running_command" from the link above):

#!/bin/ksh
tail -f a.txt | cut -f7 -d' '

Give it execute permissions with chmod 755 tailcut.sh. Then issue the following command:

socat EXEC:./tailcut.sh,pty,ctty STDIO | uniq -c

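And hey presto - your lumpy output is lumpy no more. As far as I can tell, socat runs the script on a pseudo-terminal (the pty option), so cut behaves as if it were writing to a terminal and emits each line as it is produced. socat then sends that output straight to the next pipe, and uniq can do its thing.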
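If neither stdbuf nor socat is at hand, another commonly used approach (not part of the answer above) is to let awk extract the field and flush after every line. The sketch below prefers stdbuf when it is installed and otherwise falls back to awk. time_field is a hypothetical name, and the fallback assumes an awk that supports fflush(), as gawk, mawk and the BSD awk do.

```shell
#!/bin/sh
# time_field: print field 7 (the HH:MM:SS part of the test lines) of each
# input line, flushing line by line so that a downstream uniq -c is not
# left waiting for a full buffer.
time_field() {
  if command -v stdbuf >/dev/null 2>&1; then
    # GNU coreutils available: ask cut to line-buffer its stdout.
    stdbuf -oL cut -f7 -d' '
  else
    # Fallback: awk prints the field and flushes explicitly.
    # Note: awk splits on runs of whitespace, whereas cut -d' ' treats
    # every single space as a separator.
    awk '{ print $7; fflush() }'
  fi
}

# Intended use (runs until interrupted):
#   tail -f a.txt | time_field | uniq -c
```

Either branch keeps the pipeline to three stages and needs no helper script. Which one is preferable likely depends on what is installed on the machine.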