Top "HDFS" questions

Hadoop Distributed File System (HDFS) is the default file storage system used by Apache Hadoop.

hadoop fs -ls results in "no such file or directory"

I have installed and configured Hadoop 2.5.2 on a 10-node cluster. One node acts as the master node and the others as slave nodes. I …

hadoop uri hdfs
Spark-submit not working when application jar is in hdfs

I'm trying to run a spark application using bin/spark-submit. When I reference my application jar inside my local filesystem, …

hadoop apache-spark hdfs
Difference between `yarn.scheduler.maximum-allocation-mb` and `yarn.nodemanager.resource.memory-mb`?

What is the difference between yarn.scheduler.maximum-allocation-mb and yarn.nodemanager.resource.memory-mb? I see both of these in yarn-site.xml …

hadoop memory-management hdfs yarn
Transfer file out from HDFS

I want to transfer files out of HDFS to the local filesystem of a different server which is not in hadoop …

hadoop hdfs data-transfer
How do I read Snappy compressed files on HDFS without using Hadoop?

I'm storing files on HDFS in Snappy compression format. I'd like to be able to examine these files on my …

hadoop compression hdfs snappy
How to convert sas7bdat file to csv?

I want to convert a .sas7bdat file to a .csv/txt format so that I can upload it into …

csv hadoop hive sas hdfs
Python: save pandas data frame to parquet file

Is it possible to save a pandas data frame directly to a parquet file? If not, what would be the …

python-3.x hdfs parquet
Sqoop import multiple tables

We are using Cloudera CDH 4 and we are able to import tables from our Oracle databases into our HDFS warehouse …

hadoop hive hdfs sqoop
How to merge multiple Parquet files into a single Parquet file using a Linux or HDFS command?

I have multiple small Parquet files generated as the output of a Hive QL job. I would like to merge the output …

hdfs parquet
Search for a table in all databases in Hive

In Hive, how do we search a table by name in all databases? I am a Teradata user. Is there …

hadoop hive hdfs hiveql