What is the path to a directory within the Hadoop filesystem?

Li' · Nov 12, 2013 · Viewed 47.6k times

I recently started learning Hadoop and Mahout. I want to know where a directory that is listed in the Hadoop filesystem is actually located.

In hadoop-1.2.1/conf/core-site.xml, I have specified:

<property>
  <name>hadoop.tmp.dir</name>
  <value>/Users/Li/File/Java/hdfstmp</value>
  <description>A base for other temporary directories.</description>
</property>

In the Hadoop filesystem, I have the following directories:

lis-macbook-pro:Java Li$ hadoop fs -ls
Found 4 items
drwxr-xr-x   - Li supergroup          0 2013-11-06 17:25 /user/Li/output
drwxr-xr-x   - Li supergroup          0 2013-11-06 17:24 /user/Li/temp
drwxr-xr-x   - Li supergroup          0 2013-11-06 14:50 /user/Li/tweets-seq
-rw-r--r--   1 Li supergroup    1979173 2013-11-05 15:50 /user/Li/u.data

Now, where is the /user/Li/output directory?

I tried:

lis-macbook-pro:usr Li$ cd /user/Li/output
-bash: cd: /user/Li/output: No such file or directory

So I think /user/Li/output is a relative path, not an absolute path.

Then I searched for it in /Users/Li/File/Java/hdfstmp. There are two folders:

dfs

mapred

But I still can't find /user/Li/output within /Users/Li/File/Java/hdfstmp.

Answer

Chris White · Nov 13, 2013

Your first call to hadoop fs -ls is a relative directory listing. Relative paths are resolved against the current user's home directory in HDFS, which is typically /user/${user.name}. So your hadoop fs -ls command lists files and directories relative to that location, in your case /user/Li/.

You should be able to verify this by running an absolute listing and confirming that the output matches: hadoop fs -ls /user/Li/
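To make the resolution rule concrete, here is a minimal sketch in plain shell. The function resolve_hdfs_path is a hypothetical helper written for illustration, not part of Hadoop; it only mimics the rule described above, assuming the conventional /user/<name> home directory.

```shell
#!/bin/sh
# Hypothetical helper (not a Hadoop command): mimics how a path given to
# `hadoop fs` is interpreted. Absolute paths are kept as they are; relative
# paths are placed under /user/<name>, the conventional HDFS home directory.
resolve_hdfs_path() {
  path="$1"
  user="${2:-$(id -un)}"   # defaults to the current OS user name
  case "$path" in
    /*) printf '%s\n' "$path" ;;                # already absolute
    "") printf '/user/%s\n' "$user" ;;          # no path: the home directory
    *)  printf '/user/%s/%s\n' "$user" "$path" ;;
  esac
}

resolve_hdfs_path output Li            # prints /user/Li/output
resolve_hdfs_path /user/Li/output Li   # prints /user/Li/output

# With a running cluster, these two listings should therefore match:
#   hadoop fs -ls
#   hadoop fs -ls /user/Li
```

Both calls print the same path, which is why the relative and the absolute listing should show the same four entries.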

Because these files are in HDFS, you will not find them as ordinary files on the local filesystem. File contents are split into blocks that are distributed across the cluster's DataNodes, while the metadata entries (for files and directories) are held by the NameNode.
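If you want to browse the contents with ordinary tools such as cd and ls, one option is to copy them out of HDFS first. The sketch below wraps the real hadoop fs -get command in a small function; the name fetch_from_hdfs and the default local target are assumptions made for illustration, and it assumes that hadoop is on your PATH and the cluster is running.

```shell
# Hypothetical wrapper around the real `hadoop fs -get` command: copies a
# file or directory from HDFS to the local filesystem.
fetch_from_hdfs() {
  src="$1"                          # path inside HDFS
  dest="${2:-./$(basename "$1")}"   # local target, defaults to ./<last path component>
  hadoop fs -get "$src" "$dest"
}
```

For example, fetch_from_hdfs /user/Li/output would copy the directory to ./output, after which cd output works as usual. To only peek at a single file without copying it, hadoop fs -cat /user/Li/u.data prints its contents to the terminal.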