Top "Hadoop" questions

Hadoop is an Apache open-source project that provides software for reliable and scalable distributed computing.

How to get hadoop put to create directories if they don't exist

I have been using Cloudera's Hadoop (0.20.2). With this version, if I put a file into the file system, but the …

hadoop hdfs cloudera put biginsights
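A common workaround is to create the target directory before the put. The exact flag depends on the release: older (0.20.x-era) builds of hadoop fs -mkdir create missing parents automatically, while newer releases need an explicit -p. A minimal sketch, with an illustrative path:

```shell
# Create the target directory first, then put the file into it.
# On recent Hadoop releases, -p creates any missing parent directories;
# on 0.20.x-era builds, plain -mkdir already behaves that way.
hadoop fs -mkdir -p /user/me/logs/2014      # path is illustrative
hadoop fs -put access.log /user/me/logs/2014/
```

Both commands require a running HDFS, so this is a cluster-side CLI fragment rather than something runnable standalone.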
How to get the input file name in the mapper in a Hadoop program?

How can I get the name of the input file within a mapper? I have multiple input files stored in …

hadoop mapreduce
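With the new (org.apache.hadoop.mapreduce) API, the usual approach is to cast the mapper's InputSplit to FileSplit and read the path from it. A sketch, assuming a plain text input format; the class name and output types are illustrative:

```java
import java.io.IOException;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileSplit;

// Illustrative mapper that tags every record with its source file name.
public class FileNameMapper
        extends Mapper<LongWritable, Text, Text, Text> {

    private String fileName;

    @Override
    protected void setup(Context context) {
        // The split handed to this mapper knows which file it came from.
        FileSplit split = (FileSplit) context.getInputSplit();
        fileName = split.getPath().getName();
    }

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        context.write(new Text(fileName), value);
    }
}
```

This needs the Hadoop client jars on the classpath and only runs inside a MapReduce job, so it is not independently executable.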
Yarn MapReduce Job Issue - AM Container launch error in Hadoop 2.3.0

I have set up a 2-node cluster of Hadoop 2.3.0. It's working fine and I can successfully run the distributedshell-2.2.0.jar example. But …

java hadoop mapreduce yarn
Data Replication error in Hadoop

I am implementing the Hadoop Single Node Cluster on my machine by following Michael Noll's tutorial and have come across …

hadoop replication
Writing to a file in Apache Spark

I am writing Scala code that requires me to write to a file in HDFS. When I use Filewriter.…

scala hadoop apache-spark filewriter
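java.io.FileWriter only writes to the local filesystem of the machine it runs on; to write into HDFS you go through Hadoop's FileSystem API instead. A minimal sketch, with an illustrative output path:

```scala
import java.io.PrintWriter

import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}

object HdfsWrite {
  def main(args: Array[String]): Unit = {
    // Configuration picks up core-site.xml from the classpath,
    // so fs.defaultFS points at the cluster's HDFS.
    val conf = new Configuration()
    val fs   = FileSystem.get(conf)

    // fs.create returns an output stream into HDFS, which a
    // PrintWriter can wrap like any other stream.
    val out = fs.create(new Path("/user/me/output/result.txt")) // path is illustrative
    val pw  = new PrintWriter(out)
    try pw.println("hello from HDFS")
    finally pw.close() // closing flushes and finalizes the HDFS file
  }
}
```

This requires the Hadoop client libraries and a reachable HDFS, so it is a sketch rather than a standalone program.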
Apache Hive: How to round off to 2 decimal places?

Actually I'm looking for more details about the sum function in Apache Hive. Until now, I understood that I can …

hadoop hiveql
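Hive's built-in round(x, d) rounds to d decimal places and composes with sum(). A sketch with an illustrative table and column:

```sql
-- round(x, d) rounds to d decimal places (table/column are illustrative):
SELECT round(sum(amount), 2) AS total
FROM   sales;

-- On Hive versions with DECIMAL support, casting also fixes the scale:
SELECT cast(sum(amount) AS decimal(10, 2)) AS total
FROM   sales;
```

These are HiveQL fragments; they only run inside a Hive session against an existing table.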
Why does "hadoop fs -mkdir" fail with Permission Denied?

I am using Cloudera on a VM that I am playing around with. Unfortunately I am having issues copying …

hadoop hdfs cloudera
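HDFS enforces its own permission model, separate from the local OS: by default only the HDFS superuser (typically the hdfs account) may create directories under paths like / or /user. A common fix is to create and chown a home directory for your user; the user name below is illustrative:

```shell
# Run the mkdir as the HDFS superuser, then hand the directory
# over to your own user (user name "cloudera" is illustrative).
sudo -u hdfs hadoop fs -mkdir /user/cloudera
sudo -u hdfs hadoop fs -chown cloudera:cloudera /user/cloudera

# After that, relative paths resolve into /user/cloudera
# and mkdir no longer hits Permission Denied:
hadoop fs -mkdir input
```

These commands require a running cluster with the hdfs superuser account, so they are CLI fragments rather than standalone code.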
HBase client ConnectionLoss for /hbase error

I'm going completely crazy: installed Hadoop/HBase, all is running. /opt/jdk1.6.0_24/bin/jps shows: 23261 ThriftServer, 22582 QuorumPeerMain, 21969 NameNode, 23500 Jps, 23021 HRegionServer, 22211 TaskTracker, 22891 …

java ruby hadoop hbase thrift
Hive unable to manually set number of reducers

I have the following Hive query: select count(distinct id) as total from mytable; which automatically spawns 1408 mappers and 1 reducer. I …

hadoop mapreduce hive
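A count(distinct …) aggregate forces Hive to funnel everything through a single reducer, so setting the reducer count has no visible effect. A common rewrite splits the work into two stages so the distinct step can parallelize; table and column names are illustrative:

```sql
-- Hint the reducer count (older Hive uses mapred.reduce.tasks,
-- newer versions use mapreduce.job.reduces):
set mapred.reduce.tasks = 32;

-- Two-stage rewrite: the inner DISTINCT runs across many reducers,
-- and only the final count(*) is a single-reducer step.
SELECT count(*) AS total
FROM (SELECT DISTINCT id FROM mytable) t;
```

This is a HiveQL fragment; it only runs inside a Hive session against an existing table.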
How to write to CSV in Spark

I'm trying to find an effective way of saving the result of my Spark job as a CSV file. I'm …

file csv hadoop apache-spark distributed-computing
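For RDD-based jobs, a common approach is to format each record as a comma-separated line and use saveAsTextFile; note Spark writes a directory of part-files, one per partition. A sketch under those assumptions, with illustrative data and paths:

```scala
import org.apache.spark.{SparkConf, SparkContext}

object CsvSave {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("csv-save"))

    // Illustrative data; in practice this comes from the job's result.
    val rows = sc.parallelize(Seq(("alice", 34), ("bob", 29)))

    rows
      .map { case (name, age) => s"$name,$age" } // naive CSV: no quoting/escaping
      .coalesce(1)                               // single part-file, if the data fits
      .saveAsTextFile("hdfs:///user/me/out")     // writes a directory of part files

    sc.stop()
  }
}
```

On newer Spark versions the DataFrame API (df.write.csv(path)) handles quoting and headers for you; the manual map above is the portable RDD-level route.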