Hadoop namenode metadata

leon · Jun 7, 2011 · Viewed 12.6k times

I am a bit confused by the Hadoop architecture.

  1. What kind of file metadata is stored in the Hadoop Namenode? The Hadoop wiki says the Namenode stores the entire system namespace. Is information such as last modified time, creation time, file size, owner, and permissions stored in the Namenode?

  2. Do the Datanodes store any metadata?

  3. There is only one Namenode. Can the metadata exceed that server's limit?

  4. If a user wants to download a file from Hadoop, does he have to download it from the Namenode? I found the architecture picture below on the web, and it shows that a client can write data directly to a Datanode. Is that true?

Thanks!!!!!!!

Answer

David Gruzman · Jun 8, 2011

I think the following explanation can help you better understand the HDFS architecture. You can think of the Namenode as a FAT (file allocation table) plus directory data, and of the Datanodes as dumb block devices. When you read a file from a regular file system, you go to the directory, then to the FAT, get the locations of all the relevant blocks, and read them. The same happens with HDFS. To read a file, you go to the Namenode and get the list of blocks that make up the file. This block information includes the list of Datanodes where each block is stored. You then go to those Datanodes and fetch the relevant blocks from them.
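To make the read path concrete, here is a minimal toy sketch in Python. It is not the Hadoop API: the `NameNode`, `DataNode`, and `read_file` names, the block ids, and the file path are all made up for illustration, and real HDFS adds replication policy, checksums, and streaming that are ignored here. It only shows the split described above: the Namenode answers "which blocks, on which nodes", and the bytes come from the Datanodes.

```python
from dataclasses import dataclass, field


@dataclass
class DataNode:
    """Toy 'dumb block device': stores raw block bytes keyed by block id."""
    name: str
    blocks: dict = field(default_factory=dict)

    def read_block(self, block_id):
        return self.blocks[block_id]


@dataclass
class NameNode:
    """Toy 'directory + FAT': holds no file data, only lookup tables."""
    namespace: dict = field(default_factory=dict)  # path -> ordered list of block ids
    locations: dict = field(default_factory=dict)  # block id -> list of DataNode names

    def get_block_locations(self, path):
        # Return (block id, holders) pairs in file order.
        return [(b, self.locations[b]) for b in self.namespace[path]]


def read_file(path, namenode, datanodes):
    """Client read: ask the NameNode where the blocks are, then fetch from DataNodes."""
    data = b""
    for block_id, holders in namenode.get_block_locations(path):
        # Assumption: take the first listed holder; a real client picks more carefully.
        data += datanodes[holders[0]].read_block(block_id)
    return data


# Hypothetical two-node cluster holding one two-block file.
dn1 = DataNode("dn1", {"blk_1": b"hello ", "blk_2": b"world"})
dn2 = DataNode("dn2", {"blk_2": b"world"})
nn = NameNode(
    namespace={"/user/leon/file.txt": ["blk_1", "blk_2"]},
    locations={"blk_1": ["dn1"], "blk_2": ["dn2", "dn1"]},
)
datanodes = {"dn1": dn1, "dn2": dn2}

print(read_file("/user/leon/file.txt", nn, datanodes))  # b'hello world'
```

Note that the file contents never pass through the toy `NameNode`; the client only uses it as a lookup table and then talks to the `DataNode` objects directly.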