How to merge sorted files without using a temporary file?

Matei David · Jul 6, 2011

I'm trying to merge many sorted files in a UNIX/Linux script with sort -m, and I noticed that sort first writes the result to a temporary file and then copies it to the destination. My understanding of -m is that it assumes the inputs are already sorted, so a temporary file should be unnecessary. It wastes both disk space and CPU cycles, and it stalls my pipeline, which sits waiting for sort to output anything. Is there a way to tell sort not to use temporary files when merging sorted files? Or is there a better tool that doesn't?

The exact command line looks like:

$ sort -m -s -t '_' -k 1,1n -k 2,2n <(gunzip <file_1) [...] <(gunzip <file_n) | gzip >output

I'm using sort from GNU coreutils 5.97.

Answer

Marcin · Jul 6, 2011

Check out these options from man sort. They might let you minimize the amount of space needed for merging.

--batch-size=NMERGE
    merge at most NMERGE inputs at once; for more use temp files

--compress-program=PROG
    compress temporaries with PROG; decompress them with PROG -d
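
As a rough sketch of how --batch-size could be applied here: if NMERGE is at least the number of inputs, sort should be able to merge them in one pass instead of spilling intermediate merges to temp files. This assumes your sort actually supports --batch-size (an older release such as the 5.97 from the question may not, so check man sort first). The file names, the demo data, and the count of 20 inputs are made up for illustration, and plain files stand in for the <(gunzip <file_i) inputs.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical demo data: 20 small files, each already sorted on the same keys.
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT

inputs=()
for i in $(seq 1 20); do
  # Lines look like "<a>_<b>" and are sorted numerically on both fields.
  printf '1_%d\n2_%d\n' "$i" "$i" > "$workdir/file_$i"
  inputs+=("$workdir/file_$i")
done

# Set NMERGE to the number of inputs so they can all be merged at once.
# Assumption: the installed sort supports --batch-size.
sort -m -s -t '_' -k 1,1n -k 2,2n --batch-size="${#inputs[@]}" "${inputs[@]}" \
  | gzip > "$workdir/output.gz"

# Inspect the merged result.
merged_lines=$(gunzip < "$workdir/output.gz" | wc -l | tr -d ' ')
first_line=$(gunzip < "$workdir/output.gz" | sed -n '1p')
last_line=$(gunzip < "$workdir/output.gz" | sed -n '$p')
echo "merged $merged_lines lines, from $first_line to $last_line"
```

NMERGE is bounded by the number of file descriptors the process may open, so for a very large number of inputs you may need to raise that limit or accept some temp files. In that case --compress-program=gzip at least keeps the temporaries small.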