What is the purpose of "uber mode" in hadoop?

Mohammed Asad · May 17, 2015 · Viewed 26.4k times

Hi, I am a big data newbie. I searched all over the internet to find out what exactly uber mode is, but the more I searched, the more confused I got. Can anybody please help me by answering my questions?

  • What does uber mode do?
  • Does it work differently in mapred 1.x and 2.x?
  • And where can I find the setting for it?

Answer

Rahul Reddy · Jul 6, 2015

What is uber mode in Hadoop 2?

Normally, each map and reduce task runs in its own container: the ApplicationMaster (AM) requests containers from the ResourceManager (RM), and the tasks are launched in them. With the uber configuration, the map and reduce tasks run in the same process as the ApplicationMaster instead.

Uber jobs:

Uber jobs are jobs that are executed within the MapReduce ApplicationMaster. Rather than communicating with the RM to create the mapper and reducer containers, the AM runs the map and reduce tasks within its own process and avoids the overhead of launching and communicating with remote containers.

Why?

If you have a small dataset, or you want to run MapReduce on a small amount of data, the uber configuration helps by cutting the extra time that MapReduce normally spends requesting and launching separate containers for the mappers and reducers.

Can I configure uber mode for all MapReduce jobs?

As of now, map-only jobs and jobs with one reducer are supported.
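
As for where the setting lives: uber mode is controlled by the mapreduce.job.ubertask.* properties, which you can set in mapred-site.xml or in the job's own configuration. It is off by default (mapreduce.job.ubertask.enable is false). The snippet below is only a rough sketch of the "is this job small enough?" decision, written in plain Java with no Hadoop dependency. The class and method names (UberCheck, isUberEligible) are made up for illustration, the 128 MB block size is an assumption, and the real check in Hadoop also looks at memory, CPU and chained jobs, which are left out here.

```java
// A simplified, hypothetical sketch of the uber-mode eligibility check.
// This is NOT Hadoop's actual code; it only mirrors the size limits.
class UberCheck {
    // Real property names from mapred-default.xml, listed for reference.
    static final String ENABLE = "mapreduce.job.ubertask.enable";          // default: false
    static final String MAX_MAPS = "mapreduce.job.ubertask.maxmaps";       // default: 9
    static final String MAX_REDUCES = "mapreduce.job.ubertask.maxreduces"; // default: 1
    static final String MAX_BYTES = "mapreduce.job.ubertask.maxbytes";     // default: DFS block size

    // Returns true when the job is small enough to run inside the AM.
    // Memory, CPU and chain-job checks are deliberately omitted.
    static boolean isUberEligible(boolean enabled, int numMaps, int numReduces,
                                  long inputBytes, int maxMaps, int maxReduces,
                                  long maxBytes) {
        return enabled
                && numMaps <= maxMaps
                && numReduces <= maxReduces
                && inputBytes <= maxBytes;
    }

    public static void main(String[] args) {
        long blockSize = 128L * 1024 * 1024; // assumed DFS block size (128 MB)
        long tenMb = 10L * 1024 * 1024;

        // 3 maps, 1 reducer, 10 MB of input: small enough.
        System.out.println(isUberEligible(true, 3, 1, tenMb, 9, 1, blockSize));  // true
        // Map-only job (0 reducers): also fine.
        System.out.println(isUberEligible(true, 4, 0, tenMb, 9, 1, blockSize));  // true
        // Two reducers: over the default limit of one.
        System.out.println(isUberEligible(true, 3, 2, tenMb, 9, 1, blockSize));  // false
        // Uber mode not enabled: never eligible.
        System.out.println(isUberEligible(false, 3, 1, tenMb, 9, 1, blockSize)); // false
    }
}
```

So even with uber mode enabled, a job that exceeds any of these limits still runs the normal way, with separate containers for its tasks.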