Top "Hdfs" questions

Hadoop Distributed File System (HDFS) is the default file storage system used by Apache Hadoop.

How can I save an RDD into HDFS and later read it back?

I have an RDD whose elements are of type (Long, String). For some reason, I want to save the whole …

scala apache-spark hdfs rdd bigdata
How to update a file in HDFS

I know that HDFS is write-once, read-many. Suppose I want to update a file in …

hadoop hdfs hadoop2
Writing to HDFS from Java, getting "could only be replicated to 0 nodes instead of minReplication"

I’ve downloaded and started up Cloudera's Hadoop Demo VM for CDH4 (running Hadoop 2.0.0). I’m trying to write a …

java hadoop hdfs
Moving files in Hadoop using the Java API?

I want to move files around in HDFS using the Java APIs. I cannot figure out a way to do …

java hadoop hdfs
HDFS Home Directory

I have set up a single-node, multi-user Hadoop cluster. In my cluster, there is an admin user that is responsible …

hadoop cluster-computing hdfs user-permissions
Is there a way to add nodes to a running Hadoop cluster?

I have been playing with Cloudera and I define the number of clusters before I start my job then use …

hadoop cluster-computing hbase hdfs cloudera
hdfs - ls: Failed on local exception: com.google.protobuf.InvalidProtocolBufferException:

I am trying to use the command below to list my directories in HDFS: ubuntu@ubuntu:~$ hadoop fs -ls hdfs://127.0.0.1:50075/ ls: …

hadoop hdfs cloudera
HDFS: How do you list files recursively?

How do you, through Java, list all files (recursively) under a certain path in HDFS? I went through the API …

hadoop hdfs
What exactly does "Non DFS Used" mean?

This is what I saw on the Web UI recently: Configured Capacity: 232.5 GB, DFS Used: 112.44 GB, Non DFS Used: 119.46 GB, DFS …

hadoop hdfs
Apache Spark Moving Average

I have a huge file in HDFS containing time series data points (Yahoo stock prices). I want to find the …

time-series hdfs moving-average apache-spark