Top "MapReduce" questions

MapReduce is a programming model for processing huge datasets across a large number of nodes, suited to certain kinds of distributable problems.
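As a rough illustration of the model described above, here is a minimal single-process sketch of the map, shuffle, and reduce steps applied to word counting. The function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical and chosen for this example only; a real framework such as Hadoop distributes these steps across nodes, which this sketch does not attempt.

```python
from collections import defaultdict


def map_phase(document):
    """Emit a (word, 1) pair for every word in one input document."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs):
    """Group all emitted values by key, as the framework would between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Collapse all values for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle, and reduce in sequence over a list of documents."""
    pairs = (pair for doc in documents for pair in map_phase(doc))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())
```

Calling `word_count(["a b a", "b c"])` counts each word across both inputs. Because every key is reduced independently, the reduce step is the part that can be spread over many machines.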

Map Reducing object with underscore

I want to reduce this object to just an object containing product name and average price. What's the fastest way …

javascript map mapreduce underscore.js reduce
python - PipeMapRed.waitOutputThreads(): subprocess failed with code 1

Recently, I wanted to parse websites, use BeautifulSoup to filter out what I need, and write it to a CSV file …

mapreduce beautifulsoup hadoop-streaming
Fault Tolerance in MapReduce

I was reading about Hadoop and how fault tolerant it is. I read about HDFS and how failure of …

mapreduce distributed-computing fault-tolerance
Hadoop ChainMapper, ChainReducer

I'm relatively new to Hadoop and trying to figure out how to programmatically chain jobs (multiple mappers, reducers) with ChainMapper, …

hadoop mapreduce chaining
"Combiner" class in a MapReduce job

A Combiner runs after the Mapper and before the Reducer; it receives as input all data emitted by the …

hadoop mapreduce reducers combiners
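The position of a combiner described in the excerpt above (after the mapper, before the reducer) can be sketched in plain Python. This is a simplified single-process model, not Hadoop's API: `mapper`, `combiner`, `reducer`, and `run_job` are hypothetical names, and each string in `splits` stands in for the input handled by one mapper. In Hadoop the combiner is an optimization that the framework may run zero, one, or several times, so it is only safe for operations such as sums where pre-aggregating does not change the final result.

```python
from collections import Counter, defaultdict


def mapper(split):
    """Emit (word, 1) for each word in one input split."""
    for word in split.split():
        yield word, 1


def combiner(pairs):
    """Pre-aggregate a single mapper's output locally, before the shuffle."""
    local = Counter()
    for key, value in pairs:
        local[key] += value
    return list(local.items())


def reducer(key, values):
    """Sum all (possibly pre-aggregated) values for one key."""
    return key, sum(values)


def run_job(splits, use_combiner=True):
    """Return the final counts and the number of records sent through the shuffle."""
    shuffled = defaultdict(list)
    records_shuffled = 0
    for split in splits:
        pairs = list(mapper(split))
        if use_combiner:
            pairs = combiner(pairs)  # runs on the mapper side, per split
        records_shuffled += len(pairs)
        for key, value in pairs:
            shuffled[key].append(value)
    counts = dict(reducer(k, v) for k, v in shuffled.items())
    return counts, records_shuffled
```

Running `run_job` on the same splits with and without `use_combiner` gives identical counts, while the second return value shows how many fewer records cross the shuffle when the combiner is enabled.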
Mongo Map Reduce first time

First time Map/Reduce user here, and using MongoDB. I have a lot of page visit data which I'd like …

php mongodb mapreduce mongodb-php
MapReduce alternatives

Are there any alternative paradigms to MapReduce (Google, Hadoop)? Is there any other reasonable way to split & merge …

algorithm hadoop mapreduce
Difference between combiner and partitioner

I am a newbie to MapReduce and I just can't figure out the difference between the partitioner and the combiner. I …

hadoop mapreduce partitioner
NoClassDefFoundError: org/apache/commons/lang/StringUtils

I am writing a MapReduce program to compare two files. When I run the program, it throws the following exception. Exception …

java hadoop mapreduce apache-stringutils
Why is 'mapred-site.xml' not included in the latest Hadoop 2.2.0?

The latest build of Hadoop provides mapred-site.xml.template. Do we need to create a new mapred-site.xml file using this? …

apache hadoop mapreduce bigdata