How to do natural sort on uniq -c output?
When the counts are all below 10, the uniq -c | sort output looks fine:
alvas@ubi:~/testdir$ echo -e "aaa\nbbb\naa\ncd\nada\naaa\nbbb\naa\nccd\naa" > test.txt
alvas@ubi:~/testdir$ cat test.txt
aaa
bbb
aa
cd
ada
aaa
bbb
aa
ccd
aa
alvas@ubi:~/testdir$ cat test.txt | sort | uniq -c | sort
1 ada
1 ccd
1 cd
2 aaa
2 bbb
3 aa
But when the counts reach 10 or more (even hundreds or thousands), the order comes out wrong, because sort compares the lines as strings rather than comparing the counts as integers:
alvas@ubi:~/testdir$ echo -e "aaa\nbbb\naa\nnaa\nnaa\naa\nnaa\nnaa\nnaa\nnaa\nnaa\nnaa\nnaa\nnaa\nnnaa\ncd\nada\naaa\nbbb\naa\nccd\naa" > test.txt
alvas@ubi:~/testdir$ cat test.txt | sort | uniq -c | sort
10 naa
1 ada
1 ccd
1 cd
1 nnaa
2 aaa
2 bbb
4 aa
How do I natural-sort the output of "uniq -c" in ascending or descending order?
Use -n in your sort command, so that it sorts numerically. Also, -r allows you to reverse the result:
$ sort test.txt | uniq -c | sort -n
1 ada
1 ccd
1 cd
1 nnaa
2 aaa
2 bbb
4 aa
10 naa
$ sort test.txt | uniq -c | sort -nr
10 naa
4 aa
2 bbb
2 aaa
1 nnaa
1 cd
1 ccd
1 ada
From man sort:
-n, --numeric-sort
compare according to string numerical value
-r, --reverse
reverse the result of comparisons
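If you also want a predictable order among lines that share the same count, one option is to sort on explicit keys. The following is a minimal sketch, not part of the original commands: count_desc is a hypothetical helper name, and LC_ALL=C is an assumption added here so that ties are broken by plain byte order regardless of locale.

```shell
# Hypothetical helper: count duplicate lines read from stdin,
# most frequent first.
count_desc() {
  # uniq -c only merges adjacent duplicates, so sort the input first.
  # -k1,1nr : compare the first field (the count) numerically, reversed.
  # -k2     : break ties on the rest of the line (the counted text).
  LC_ALL=C sort | uniq -c | LC_ALL=C sort -k1,1nr -k2
}
```

For example, count_desc < test.txt lists the most frequent line first. Dropping the r from -k1,1nr gives ascending counts instead.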