Recently I started learning Hadoop and Mahout. I want to know the path to a directory within the Hadoop filesystem.
In hadoop-1.2.1/conf/core-site.xml, I have specified:
<property>
  <name>hadoop.tmp.dir</name>
  <value>/Users/Li/File/Java/hdfstmp</value>
  <description>A base for other temporary directories.</description>
</property>
In Hadoop filesystem, I have the following directories:
lis-macbook-pro:Java Li$ hadoop fs -ls
Found 4 items
drwxr-xr-x - Li supergroup 0 2013-11-06 17:25 /user/Li/output
drwxr-xr-x - Li supergroup 0 2013-11-06 17:24 /user/Li/temp
drwxr-xr-x - Li supergroup 0 2013-11-06 14:50 /user/Li/tweets-seq
-rw-r--r-- 1 Li supergroup 1979173 2013-11-05 15:50 /user/Li/u.data
Now where is /user/Li/output directory?
I tried:
lis-macbook-pro:usr Li$ cd /user/Li/output
-bash: cd: /user/Li/output: No such file or directory
So I think /user/Li/output is a relative path, not an absolute path.
Then I searched for it in /Users/Li/File/Java/hdfstmp. There are two folders:
dfs
mapred
But still I can't find /user/Li/output within /Users/Li/File/Java/hdfstmp.
Your first call to hadoop fs -ls is a relative directory listing for the current user, typically rooted in a directory called /user/${user.name} in HDFS. So your hadoop fs -ls command is listing files and directories relative to this location, in your case /user/Li/.
You should be able to verify this by running an absolute listing and confirming the contents match: hadoop fs -ls /user/Li/
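As a quick sanity check (assuming a running cluster, with the paths taken from the listing above), the relative and absolute forms should produce the same output:

```shell
# Relative listing: resolved against the current user's HDFS home
# directory, which is /user/Li for user "Li".
hadoop fs -ls

# Absolute listing of the same location; the output should match.
hadoop fs -ls /user/Li/
```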
As these files live in HDFS, you will not find them on the local filesystem: the file data is distributed across your cluster nodes as blocks, and the metadata entries for files and directories are held by the NameNode.
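If you do want to inspect the output locally, copy it out of HDFS rather than looking for it under hadoop.tmp.dir. A sketch using the paths from the question (assumes the cluster from the question is running):

```shell
# Copy the HDFS directory down to the local filesystem for inspection.
hadoop fs -get /user/Li/output /Users/Li/File/Java/output

# Or print a file's contents straight from HDFS without copying it:
hadoop fs -cat /user/Li/u.data | head

# To see which DataNodes actually hold a file's blocks, ask the NameNode:
hadoop fsck /user/Li/u.data -files -blocks -locations
```

Note that the dfs and mapred folders you found under /Users/Li/File/Java/hdfstmp contain HDFS's internal block files and MapReduce scratch data; the block files are not stored under their HDFS path names, which is why searching there for /user/Li/output finds nothing.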