Top "Hdfs" questions

Hadoop Distributed File System (HDFS) is the default file storage system used by Apache Hadoop.

Hive load CSV with commas in quoted fields

I am trying to load a CSV file into a Hive table like so: CREATE TABLE mytable ( num1 INT, text1 …

hadoop hbase hive hdfs delimiter
How to list all files in a directory and its subdirectories in hadoop hdfs

I have a folder in HDFS that has two subfolders, each of which has about 30 subfolders, each of which, finally, contains …

java hadoop hdfs
How to copy data from one HDFS to another HDFS?

I have two HDFS setups and want to copy (not migrate or move) some tables from HDFS1 to HDFS2. How …

hadoop hdfs bigdata sqoop
What are the pros and cons of parquet format compared to other formats?

Characteristics of Apache Parquet are: self-describing, columnar format, language-independent. In comparison to Avro, Sequence Files, RC File, etc., I want …

file hadoop hdfs avro parquet
Write a file in hdfs with Java

I want to create a file in HDFS and write data to it. I used this code: Configuration config = new …

java hadoop hdfs
How to find the size of an HDFS file

How to find the size of an HDFS file? What command should be used to find the size of any …

hadoop hdfs
Is there an hdfs command to list files in an HDFS directory by timestamp

Is there an hdfs command to list files in an HDFS directory by timestamp, ascending or descending? By default, hdfs …

hadoop hdfs
How to delete files from the HDFS?

I just downloaded the Hortonworks sandbox VM, which includes Hadoop version 2.7.1. I added some files by using …

hadoop hdfs hortonworks-data-platform
HDFS free space available command

Is there an hdfs command to see the available free space in HDFS? We can see that through the browser at master:…

hadoop hdfs
SparkSQL - Read parquet file directly

I am migrating from Impala to SparkSQL, using the following code to read a table: my_data = sqlContext.read.parquet(…

scala apache-spark hive apache-spark-sql hdfs