Hadoop fs -du -h: sorting by size for M, G, T, P, E, Z, Y

Mayur Narang · Jun 28, 2016 · Viewed 6.9k times

I am running this command:

sudo -u hdfs hadoop fs -du -h /user | sort -nr

and the output is not sorted correctly across units (gigabytes, terabytes, and so on).

I also found this command:

hdfs dfs -du -s /foo/bar/*tobedeleted | sort -r -k 1 -g | awk '{ suffix="KMGT"; for(i=0; $1>1024 && i < length(suffix); i++) $1/=1024; print int($1) substr(suffix, i, 1), $3; }'

but it did not seem to work.

Is there a way, or a command-line flag I can use, to sort the output so that it looks like this?

123T  /xyz
124T  /xyd
126T  /vat
127G  /ayf
123G  /atd

Please help.

Regards, Mayur

Answer

Li Su · Aug 5, 2019
hdfs dfs -du -h <PATH> | awk '{print $1$2,$3}' | sort -hr

Short explanation:

  • The hdfs command gets the input data.
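  • The awk step joins the first two fields (the number and its unit, e.g. 1.2 and G become 1.2G) and prints the third field after a space. The comma in print inserts awk's output field separator, not a literal comma.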
  • The -h option of sort (available in GNU sort) compares human-readable numbers such as 2K or 4G, and -r reverses the order so the largest entries come first.
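
To try the pipeline without a cluster, here is a minimal sketch that feeds made-up lines through the same awk and sort steps. The sample_du function and its paths are hypothetical stand-ins for the hdfs output. They assume a layout where the number and the unit are separate fields followed by the path; if your Hadoop version prints extra columns, or omits the unit for small sizes, the awk field numbers would need adjusting. It also assumes a sort that supports -h, such as GNU sort.

```shell
# Hypothetical sample lines imitating a "number unit path" layout.
# These are NOT real hdfs output; they only stand in for it.
sample_du() {
  printf '%s\n' \
    '127 G  /ayf' \
    '123 T  /xyz' \
    '512 M  /tmp' \
    '126 T  /vat'
}

# Same steps as the answer: glue number and unit together,
# then sort by human-readable size, largest first.
sorted=$(sample_du | awk '{print $1$2, $3}' | sort -hr)
printf '%s\n' "$sorted"
# Prints:
# 126T /vat
# 123T /xyz
# 127G /ayf
# 512M /tmp
```

This also shows why the sort -nr from the question falls short: -n compares only the leading number and ignores the unit, so 127G would be placed ahead of 126T.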