Top "HDFS" questions

Hadoop Distributed File System (HDFS) is the default file storage system used by Apache Hadoop.

Spark on YARN concept understanding

I am trying to understand how Spark runs on YARN in cluster/client mode. I have the following question in mind. …

hadoop apache-spark hdfs yarn
HDFS put vs. WebHDFS

I'm loading a 28 GB file into Hadoop HDFS using WebHDFS, and it takes ~25 minutes to load. I tried loading the same file …

hadoop hdfs webhdfs
Wildcard in Hadoop's FileSystem listing API calls

tl;dr: To be able to use wildcards (globs) in the listed paths, one simply has to use globStatus(...) instead …

java hadoop hdfs wildcard
Hadoop NameNode: single point of failure

The NameNode in the Hadoop architecture is a single point of failure. How do people who have large Hadoop clusters …

hadoop mapreduce hdfs yarn hadoop2
How to find optimal number of mappers when running Sqoop import and export?

I'm using Sqoop version 1.4.2 with an Oracle database. When running a Sqoop command, for example like this: ./sqoop import \ --fs <name …

oracle hadoop mapreduce hdfs sqoop
Spark Standalone Mode: How to compress Spark output written to HDFS

Related to my other question, but distinct: someMap.saveAsTextFile("hdfs://HOST:PORT/out") If I save an RDD to HDFS, …

scala compression hdfs apache-spark
Amazon s3a returns 400 Bad Request with Spark

For checkpointing, I am trying to set up an Amazon S3 bucket as the checkpoint file. val checkpointDir = "s3a://bucket-name/…

amazon-web-services amazon-s3 apache-spark hdfs spark-streaming
Sqoop import as ORC file

Is there any option in Sqoop to import data from an RDBMS and store it in ORC file format in HDFS? …

hdfs rdbms sqoop
Could not find uri with key dfs.encryption.key.provider.uri to create a keyProvider in HDFS encryption for CDH 5.4

CDH version: CDH5.4.5. Issue: when HDFS encryption is enabled using the KMS available in Hadoop CDH 5.4, I get an error while putting a file …

hadoop encryption copy hdfs cloudera-cdh
How to remove an Ambari service after it has been added

After you add a service to a node, how do you go about removing that service from, say, one node …

hadoop hdfs hortonworks-data-platform