I am using Hadoop 2.4.0 on a system with 24 cores and 96 GB of RAM.
I am using the following configuration:
mapreduce.map.cpu.vcores=1
yarn.nodemanager.resource.cpu-vcores=10
yarn.scheduler.minimum-allocation-vcores=1
yarn.scheduler.maximum-allocation-vcores=4
yarn.app.mapreduce.am.resource.cpu-vcores=1
yarn.nodemanager.resource.memory-mb=88064
mapreduce.map.memory.mb=3072
mapreduce.map.java.opts=-Xmx2048m
Capacity Scheduler configuration:
queue.default.capacity=50
queue.default.maximum_capacity=100
yarn.scheduler.capacity.root.default.user-limit-factor=2
With the above configuration, I expect YARN to launch no more than 10 mappers per node, but it is launching 28 mappers per node. Am I doing something wrong?
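YARN is running more containers than the allocated vcores because the Capacity Scheduler uses DefaultResourceCalculator by default, and that calculator considers only memory. With your settings, 88064 MB / 3072 MB rounds down to 28 containers per node, and the vcore settings are ignored: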
public int computeAvailableContainers(Resource available, Resource required) {
  // Only consider memory
  return available.getMemory() / required.getMemory();
}
Use DominantResourceCalculator instead; it takes both CPU and memory into account.
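To check the numbers from the question under both rules, here is a minimal sketch. ContainerEstimate and its methods are hypothetical names written for this answer, not Hadoop classes; the memory-and-CPU rule is a simplification that assumes the scarcer of the two resources limits the container count.

```java
// Hypothetical helper for illustration only; these are not the Hadoop classes.
final class ContainerEstimate {

    // Memory-only rule, mirroring the DefaultResourceCalculator snippet above.
    static int memoryOnly(int nodeMemoryMb, int containerMemoryMb) {
        return nodeMemoryMb / containerMemoryMb;
    }

    // Simplified memory-and-CPU rule (assumption): the scarcer resource wins.
    static int memoryAndCpu(int nodeMemoryMb, int nodeVcores,
                            int containerMemoryMb, int containerVcores) {
        return Math.min(nodeMemoryMb / containerMemoryMb,
                        nodeVcores / containerVcores);
    }

    public static void main(String[] args) {
        int nodeMemoryMb = 88064;      // yarn.nodemanager.resource.memory-mb
        int nodeVcores = 10;           // yarn.nodemanager.resource.cpu-vcores
        int containerMemoryMb = 3072;  // mapreduce.map.memory.mb
        int containerVcores = 1;       // mapreduce.map.cpu.vcores

        // Prints "memory only: 28"
        System.out.println("memory only: "
                + memoryOnly(nodeMemoryMb, containerMemoryMb));
        // Prints "memory and CPU: 10"
        System.out.println("memory and CPU: "
                + memoryAndCpu(nodeMemoryMb, nodeVcores,
                               containerMemoryMb, containerVcores));
    }
}
```

With memory alone the node fits 28 map containers, which matches what you observed; once vcores are counted, the 10 vcores per node become the limit.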
Set the following property in capacity-scheduler.xml:
yarn.scheduler.capacity.resource-calculator=org.apache.hadoop.yarn.util.resource.DominantResourceCalculator
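Written out in the usual Hadoop property format, the entry would look roughly like this. This is a sketch that assumes it goes inside the existing configuration element of your capacity-scheduler.xml; the property name and value are the ones given above.

```xml
<property>
  <name>yarn.scheduler.capacity.resource-calculator</name>
  <value>org.apache.hadoop.yarn.util.resource.DominantResourceCalculator</value>
</property>
```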
More about DominantResourceCalculator