Top "HDFS" questions

Hadoop Distributed File System (HDFS) is the default file storage system used by Apache Hadoop.

Spark - load CSV file as DataFrame?

I would like to read a CSV in Spark, convert it to a DataFrame, and store it in HDFS with …

scala apache-spark hadoop apache-spark-sql hdfs
How to copy file from HDFS to the local file system

How to copy a file from HDFS to the local file system? There is no physical location of a file under …

hadoop copy hdfs
hadoop copy a local file system folder to HDFS

I need to copy a folder from the local file system to HDFS. I could not find any example of moving …

hadoop hdfs
Name node is in safe mode. Not able to leave

root# bin/hadoop fs -mkdir t
mkdir: org.apache.hadoop.hdfs.server.namenode.SafeModeException: Cannot create directory /user/root/t. …

hadoop hdfs
The way to check an HDFS directory's size?

I know du -sh on common Linux filesystems, but how do I do that with HDFS?

hadoop command-line directory hdfs
Where does Hive store files in HDFS?

I'd like to know how to find the mapping between Hive tables and the actual HDFS files (or rather, directories) …

hadoop hive hdfs
How to fix corrupt HDFS Files

How does someone fix an HDFS that's corrupt? I looked on the Apache/Hadoop website and it said its fsck …

hadoop hdfs
Namenode not getting started

I was using Hadoop in pseudo-distributed mode and everything was working fine. But then I had to restart my …

hadoop hdfs
Permission denied at hdfs

I am new to the Hadoop Distributed File System. I have done a complete single-node installation of Hadoop on my machine. …

shell security hadoop permissions hdfs
What is the purpose of the shuffling and sorting phase in the reducer in MapReduce programming?

In MapReduce programming, the reduce phase has shuffling, sorting, and reduce as its sub-parts. Sorting is a costly affair. …

sorting hadoop mapreduce hdfs shuffle