Top "MapReduce" questions

MapReduce is a programming model for processing huge datasets on certain kinds of distributable problems using a large number of nodes.

How to decide between a map-side join and a reduce-side join when writing MapReduce code in Java?

hadoop mapreduce hadoop-streaming
array_reduce() can't work as associative-array "reducer" for PHP?

I have an associative array $assoc, and need to reduce it to a string, in this context $OUT = "<…

php mapreduce associative-array
Sorted word count using Hadoop MapReduce

I'm very new to MapReduce and I completed a Hadoop word-count example. In that example it produces an unsorted file (…

hadoop mapreduce word-count parallel-processing
Use cases for mapred.job.queue.name

What are the real-world use cases for MapReduce job queues, i.e. the value of mapred.job.…

hadoop mapreduce cloudera hortonworks-data-platform
20 Billion Rows/Month - Hbase / Hive / Greenplum / What?

I'd like to use your wisdom for picking the right solution for a data-warehouse system. Here are some details …

database mapreduce data-warehouse greenplum vldb
Cognitive Complexity and its effect on the code

With respect to one of the Java projects, we recently started using SonarLint. Output of the code analysis shows …

java algorithm mapreduce refactoring sonarlint
Hadoop JobConf class is deprecated, need updated example

I am writing Hadoop programs, and I really don't want to play with deprecated classes. Anywhere online I am not …

hadoop mapreduce cloudera
Rolling your own reduceByKey in Spark Dataset

I'm trying to learn to use DataFrames and Datasets more, in addition to RDDs. For an RDD, I know I …

scala apache-spark mapreduce
Hadoop: Provide directory as input to MapReduce job

I'm using Cloudera Hadoop. I'm able to run a simple MapReduce program where I provide a file as input to MapReduce …

java hadoop input mapreduce cloudera
Is the MongoDB aggregation framework faster than map/reduce?

Does the aggregation framework introduced in MongoDB 2.2 have any special performance improvements over map/reduce? If yes, why and how …

performance mongodb mapreduce aggregation-framework