I have tab delimited files with several columns. I want to count the frequency of occurrence of the different values in a column for all the files in a folder and sort them in decreasing order of count (highest count first). How would I accomplish this in a Linux command line environment?
It can use any common command line language like awk, perl, python etc.
To see a frequency count for column two (for example):
awk -F '\t' '{print $2}' * | sort | uniq -c | sort -nr
fileA.txt
z z a
a b c
w d e
fileB.txt
t r e
z d a
a g c
fileC.txt
z r a
v d c
a m c
Result:
3 d
2 r
1 z
1 m
1 g
1 b