Top "MapReduce" questions

MapReduce is a programming model for processing huge datasets, for certain kinds of distributable problems, across a large number of nodes.

Gradle Transitive dependency exclusion is not working as expected. (How do I get rid of com.google.guava:guava-jdk5:13.0 ?)

Here is a snippet of my build.gradle: compile 'com.google.api-client:google-api-client:1.19.0' compile 'com.google.apis:google-api-services-oauth2:v2…

java google-app-engine mapreduce gradle guava
Is it better to use the mapred or the mapreduce package to create a Hadoop Job?

To create MapReduce jobs you can either use the old org.apache.hadoop.mapred package or the newer org.apache.…

hadoop mapreduce
MapReduce implementation in Scala

I'd like to find a good, robust MapReduce framework that can be used from Scala.

scala frameworks google-analytics mapreduce
Java 8 Stream function to group a List of anagrams into a Map of Lists

Java 8 is about to be released... While learning about Streams, I got into a scenario about grouping anagrams using one …

java mapreduce java-8 anagram java-stream
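One common way to approach the anagram-grouping question above is to key each word by its sorted letters and collect with `Collectors.groupingBy`. The following is a minimal sketch, not the asker's code; the class name `AnagramGrouper` and the helper `sortedLetters` are hypothetical names chosen for illustration, and the input is assumed to be plain lowercase words.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical class name, used only for this illustration.
public class AnagramGrouper {

    // Canonical key: the word's letters in sorted order,
    // so every anagram of a word maps to the same key.
    static String sortedLetters(String word) {
        char[] letters = word.toCharArray();
        Arrays.sort(letters);
        return new String(letters);
    }

    // Groups words into a Map from canonical key to the list of words sharing it.
    static Map<String, List<String>> groupAnagrams(List<String> words) {
        return words.stream()
                .collect(Collectors.groupingBy(AnagramGrouper::sortedLetters));
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("listen", "silent", "tan", "nat", "bat");
        // Iteration order of the resulting map is not guaranteed.
        System.out.println(groupAnagrams(words));
    }
}
```

`groupingBy` plays the role of the shuffle step here: the key function is the "map" side, and the downstream collector (a list by default) is the "reduce" side.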
Hadoop: key and value are tab-separated in the output file. How do I make them semicolon-separated?

I think the title already explains my question. I would like to change key (tab space) value into key;…

map hadoop mapreduce reduce
Implementing PageRank using MapReduce

I'm trying to get my head around an issue with the theory of implementing PageRank with MapReduce. I have …

algorithm mapreduce pagerank
Hadoop: job runs okay on a smaller set of data but fails with a large dataset

I have the following situation: a 3-machine cluster with the following configuration. Master Usage of /: 91.4% of 74.41GB MemTotal: 16557308 kB MemFree: 723736 …

java hadoop mapreduce hadoop-streaming
How to restrict the number of concurrently running map tasks?

My Hadoop version is 1.0.2. Now I want at most 10 map tasks running at the same time. I have found 2 variable …

map hadoop mapreduce task jobs
Differences between MapReduce and YARN

I was reading about Hadoop and MapReduce with respect to straggler problems and the papers on this problem, but yesterday …

hadoop mapreduce hadoop-yarn speculative-execution