How to find the size of an HDFS file

priya · Jul 20, 2012 · Viewed 90.9k times

How do I find the size of an HDFS file? What command should be used to find the size of any file in HDFS?

Answer

Paul M · Jul 20, 2012

I also find myself using hadoop fs -dus <path> a great deal. For example, if a directory on HDFS named "/user/frylock/input" contains 100 files and you need the total size of all of those files, you could run:

hadoop fs -dus /user/frylock/input

and you would get back the total size (in bytes) of all of the files in the "/user/frylock/input" directory.

Also, keep in mind that HDFS stores data redundantly, so the physical storage a file actually consumes can be 3x or more what hadoop fs -ls and hadoop fs -dus report (the default replication factor is 3).
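
To make that last point concrete, here is a rough sketch of the arithmetic. It assumes every file under the path uses the same replication factor (3 unless you pass another value). The function name physical_bytes is only illustrative and not part of Hadoop, and the byte counts are made-up inputs, not real command output.

```shell
# Estimate raw cluster storage from a logical size reported by
# hadoop fs -ls or hadoop fs -dus.
# $1: logical size in bytes
# $2: replication factor (assumed uniform; defaults to 3)
physical_bytes() {
  echo $(( $1 * ${2:-3} ))
}

# A 1 MiB file at the default replication factor of 3:
physical_bytes 1048576      # prints 3145728

# The same file if it were stored with a replication factor of 2:
physical_bytes 1048576 2    # prints 2097152
```

In practice you would copy the byte count from the hadoop fs -dus output into the first argument. On newer Hadoop releases -dus is deprecated in favor of hadoop fs -du -s, and the column layout of the output differs between versions, so check it before parsing it in a script.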