WebHDFS vs HttpFS

Santiago Cepas · Jul 30, 2014 · Viewed 11.7k times

What is the difference between the WebHDFS REST API and HttpFS?

If I understand correctly:

  • HttpFS is an independent service that exposes a REST API on top of HDFS.
  • WebHDFS is a REST API built into HDFS. It doesn't require any further installation.

Am I correct?

When would it be advisable to use one instead of the other?

Answer

Likoed · Sep 3, 2014

I have read an article related to your question. Here is the link:

https://www.linkedin.com/today/post/article/20140717115238-176301000-accessing-hdfs-using-the-webhdfs-rest-api-vs-httpfs

From the article, the major difference between WebHDFS and HttpFS is this: WebHDFS needs access to all nodes of the cluster, and when data is read it is transmitted directly from the node that holds it. With HttpFS, a single node acts as a "gateway" and is the single point of data transfer to the client.

So HttpFS can become a bottleneck during a large file transfer, but the upside is that it minimizes the footprint required to access HDFS, since the client only needs to reach one host.
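
To make the difference concrete, here is a minimal Python sketch that builds the request URL for reading a file in each setup. Both services serve the same `/webhdfs/v1` REST layout, so only the host and port change. The hostnames, the file path, and the user name are made up for illustration, and the ports (50070 for the NameNode web UI in Hadoop 2, 14000 for HttpFS) are common defaults that may well differ on your cluster.

```python
from urllib.parse import quote, urlencode


def webhdfs_url(host, port, path, op, **params):
    """Build a WebHDFS-style REST URL (HttpFS serves the same URL layout)."""
    query = urlencode({"op": op, **params})
    return f"http://{host}:{port}/webhdfs/v1{quote(path)}?{query}"


# Hypothetical hosts and path; ports are assumed defaults, check your cluster.
FILE_PATH = "/user/alice/data.csv"

# WebHDFS: the request goes to the NameNode, which typically answers OPEN
# with a redirect to a DataNode, so the client must be able to reach DataNodes.
direct = webhdfs_url("namenode.example.com", 50070, FILE_PATH, "OPEN")

# HttpFS: the request goes to the gateway host, which streams the data itself,
# so the client only needs network access to this one host.
gateway = webhdfs_url(
    "httpfs.example.com", 14000, FILE_PATH, "OPEN", **{"user.name": "alice"}
)

print(direct)
print(gateway)
```

Fetching `direct` with an HTTP client that follows redirects would end up talking to a DataNode, while fetching `gateway` keeps all traffic on the HttpFS host. That is the trade-off described above: direct transfer and better throughput versus a single, easier-to-firewall entry point.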