HDFS Reduced Replication Factor

Carl Sagan · Jul 23, 2013

I've reduced the replication factor from 3 to 1, but I don't see any activity from the namenode or between datanodes to remove the over-replicated HDFS file blocks. Is there a way to monitor or force the replication job?

Answer

Charles Menguy · Jul 23, 2013

Changing dfs.replication only applies to files created after the change; it does not modify the replication factor of files that already exist.
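For new files, the replication factor can also be overridden for a single command instead of cluster-wide. A minimal sketch, assuming the generic -D option is given before the -put subcommand; the helper name put_with_replication and the paths in the usage line are hypothetical:

```shell
# Upload a local file with an explicit replication factor, overriding
# dfs.replication for this one command only.
# put_with_replication is a hypothetical helper name, not part of Hadoop.
# Arguments: <replication> <local source> <HDFS destination>
put_with_replication() {
  hadoop fs -D dfs.replication="$1" -put "$2" "$3"
}
```

For example, put_with_replication 1 ./local.txt /data/local.txt would write the file with a single replica, regardless of the configured default.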

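To change the replication factor of files that already exist, you can run the following command, which applies recursively to every file in HDFS. The -w flag makes the command wait until replication has completed, which can take a long time on a large filesystem: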

hadoop fs -setrep -R -w 1 /
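To monitor the result, hadoop fsck / prints a summary that includes the number of over-replicated blocks and the average block replication. The recursive file listing also shows each file's replication factor in its second column, so a rough per-factor count can be sketched as below. This assumes the usual listing layout (permissions, replication, owner, group, size, date, time, path); the function name count_by_replication is hypothetical:

```shell
# Count files per replication factor from a recursive HDFS listing on stdin.
# count_by_replication is a hypothetical helper name, not part of Hadoop.
# Lines whose permission string starts with "-" are regular files;
# directories start with "d" and show "-" in the replication column.
count_by_replication() {
  awk '$1 ~ /^-/ { count[$2]++ }
       END { for (r in count) print r, count[r] }' | sort -n
}
```

It could be used as hadoop fs -ls -R / | count_by_replication (older releases use -lsr in place of -ls -R). Once the change has been applied everywhere, the output should contain a single line for replication factor 1.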