Is there an equivalent of the `find` command in `hadoop`?

makansij · Oct 1, 2015 · Viewed 13.2k times

I know that from the terminal, one can use the find command to locate files, for example:

find . -maxdepth 4 -type d -name "*something*"

But when working with the Hadoop file system, I have not found a way to do this.

hadoop fs -find ....

throws an error.

How do people traverse files in hadoop? I'm using hadoop 2.6.0-cdh5.4.1.

Answer

Legato · Oct 1, 2015

hadoop fs -find was introduced in Apache Hadoop 2.7.0. Most likely you're using an older version, so you don't have it yet. See HADOOP-8989 for more information.

In the meantime you can use

hdfs dfs -ls -R <pattern>

e.g.: hdfs dfs -ls -R /demo/order*.*

That's not as powerful as find, of course, and lacks some basics such as filtering by type or limiting depth. From what I understand, people have been writing scripts around it to get past this limitation.
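As a rough illustration of such a script, here is a minimal sketch that filters the output of hdfs dfs -ls -R with awk to approximate -type, -name and -maxdepth. The function name hdfs_find_filter and its argument order are made up for this example. It assumes the usual eight-column listing layout with the path in the last column, so paths containing spaces will not work, and it matches names with a regular expression rather than a shell glob.

```shell
# Hypothetical helper (not part of Hadoop): reads `hdfs dfs -ls -R` output on stdin.
#   $1 = type: d (directory), f (file), or a (any)
#   $2 = regular expression matched against the last path component
#   $3 = maximum depth below the root, as with find's -maxdepth
#   $4 = the root path that was listed
hdfs_find_filter() {
    awk -v type="$1" -v pat="$2" -v maxdepth="$3" -v root="$4" '
    BEGIN { sub(/\/+$/, "", root) }      # drop trailing slashes from the root
    NF >= 8 {                            # skips "Found N items" header lines
        path = $8                        # assumption: path is the 8th column
        kind = (substr($1, 1, 1) == "d") ? "d" : "f"
        if (type != "a" && type != kind) next
        rel = substr(path, length(root) + 1)
        depth = gsub(/\//, "/", rel)     # count slashes below the root
        if (depth > maxdepth) next
        n = split(path, parts, "/")
        if (parts[n] ~ pat) print path   # match on the base name only
    }'
}
```

It would be used roughly like this, with /demo standing in for whatever directory you want to search:

hdfs dfs -ls -R /demo | hdfs_find_filter d 'something' 4 /demo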