Top "MapReduce" questions

MapReduce is an algorithm for processing huge datasets on certain kinds of distributable problems using a large number of nodes.

Can OLAP be done in BigTable?

In the past I used to build WebAnalytics using OLAP cubes running on MySQL. Now an OLAP cube the way …

hadoop olap mapreduce hbase hive
Getting the Tool Interface warning even though it is implemented

I have a very simple "Hello world" style map/reduce job. public class Tester extends Configured implements Tool { @Override public …

java hadoop mapreduce hortonworks-data-platform
How to get print output for debugging map/reduce in Mongoid?

I'm writing a map/reduce operation with Mongoid 3.0. I'm trying to use the print statement to debug the JS functions. …

mongodb mapreduce mongoid mongoid3
In MongoDB mapreduce, how can I flatten the values object?

I'm trying to use MongoDB to analyse Apache log files. I've created a receipts collection from the Apache access logs. …

mongodb mapreduce
How to load data from an HDFS SequenceFile in Python

I have a map reduce program running to read the HDFS file as below: hadoop jar /opt/mapr/hadoop/hadoop-0.20.2/…

python hadoop mapreduce hive sequencefile
Aggregate Functions over a List in Java

I have a list of Java Objects and I need to reduce it applying Aggregate Functions like a select over …

java database mapreduce data-processing
Storing images in HBase for processing and quick access

I have a large number of image files that I need to store and process on HDFS. Let's assume 2 scenarios: …

image hadoop mapreduce hbase random-access
Is the output of the map phase of a MapReduce job always sorted?

I am a bit confused with the output I get from Mapper. For example, when I run a simple wordcount …

hadoop mapreduce hadoop2
Hadoop combiner sort phase

When running a MapReduce job with a specified combiner, is the combiner run during the sort phase? I understand that …

hadoop mapreduce combiners
Pig - ERROR 1045: AVG as multiple or none of them fit. Please use an explicit cast

I have a comma-separated .txt file, and I want to DUMP the AVG age of all Males. records = LOAD 'file:/…

hadoop mapreduce apache-pig bigdata