I'd like to know if there are any tips to make grep as fast as possible. I have a rather large base of text files to search in the quickest possible way. I've made them all lowercase, so that I could get rid of the -i option. This makes the search much faster.
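For the record, I lowercased everything with a small shell loop along these lines (the *.txt pattern is just illustrative of my setup):

find . -type f -name '*.txt' | while read -r f; do
    # rewrite each file as its lowercase version via a temp file
    tr '[:upper:]' '[:lower:]' < "$f" > "$f.tmp" && mv "$f.tmp" "$f"
done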
Also, I've found out that the -F and -P modes are quicker than the default one. I use the former when the search string is not a regular expression (just plain text), and the latter when a regex is involved.
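For example (the patterns and file name here are made up):

grep -F 'error 404' logfile.txt      # -F: pattern is a plain fixed string
grep -P 'error\s+\d{3}' logfile.txt  # -P: pattern is a Perl-compatible regex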
Does anyone have any experience speeding up grep? Maybe compiling it from scratch with particular flags (I'm on Linux CentOS), organizing the files in a certain fashion, or making the search parallel in some way?
Try GNU parallel, which includes an example of how to use it with grep:
grep -r greps recursively through directories. On multicore CPUs, GNU parallel can often speed this up:

find . -type f | parallel -k -j150% -n 1000 -m grep -H -n STRING {}
This will run 1.5 jobs per core, and give 1000 arguments to each invocation of grep.
For big files, it can split the input into several chunks with the --pipe and --block arguments:
parallel --pipe --block 2M grep foo < bigfile
You can also run it on several different machines through SSH (an ssh-agent is needed to avoid password prompts):
parallel --pipe --sshlogin server.example.com,server2.example.net grep foo < bigfile
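If you want the local machine to take part as well, GNU parallel accepts : in the login list as a special sshlogin meaning "run on the local host", so something like this should work (the hostnames are placeholders):

parallel --pipe --sshlogin :,server.example.com,server2.example.net grep foo < bigfile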