Apache Spark: setting executor instances does not change the number of executors

user4688877 · Apr 29, 2015

I have an Apache Spark application running on a YARN cluster (Spark has 3 nodes on this cluster) in cluster mode.

When the application is running, the Spark UI shows 2 executors (each running on a different node) and the driver running on the third node. I want the application to use more executors, so I tried adding the argument --num-executors to spark-submit and setting it to 6.

spark-submit --driver-memory 3G --num-executors 6 --class main.Application --executor-memory 11G --master yarn-cluster myJar.jar <arg1> <arg2> <arg3> ...

However, the number of executors remains 2.

In the Spark UI I can see that the parameter spark.executor.instances is 6, just as I intended, but somehow there are still only 2 executors.

I even tried setting this parameter from the code

sparkConf.set("spark.executor.instances", "6")

Again, I can see that the parameter was set to 6, but still there are only 2 executors.

Does anyone know why I couldn't increase the number of my executors?

yarn.nodemanager.resource.memory-mb is 12g in yarn-site.xml

Answer

banjara · Apr 29, 2015

Increase yarn.nodemanager.resource.memory-mb in yarn-site.xml

With 12g per node you can only launch the driver (3g) and 2 executors (11g each).

Node1 - driver 3g (+7% overhead)

Node2 - executor1 11g (+7% overhead)

Node3 - executor2 11g (+7% overhead)

Now you are requesting executor3 with 11g, and no node has 11g (plus overhead) of memory still available.
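
As a rough back-of-the-envelope check (assuming 12g means 12288 MB per node and the default overhead of max(384 MB, 7% of the requested memory), per the link below):

executor container ≈ 11264 MB + max(384, 0.07 × 11264 ≈ 788) ≈ 12052 MB
driver container   ≈  3072 MB + max(384, 0.07 ×  3072 ≈ 215) ≈  3456 MB

Each 12288 MB node therefore fits at most one 11g executor container, and the node running the driver has only about 8832 MB left, well short of the ~12052 MB a third executor would need.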

For the 7% overhead, refer to spark.yarn.executor.memoryOverhead and spark.yarn.driver.memoryOverhead in https://spark.apache.org/docs/1.2.0/running-on-yarn.html.
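
For illustration, the setting in yarn-site.xml looks roughly like this (the 24576 value is purely hypothetical, sized so that two 11g-plus-overhead executor containers fit per node; use whatever the physical memory of your NodeManagers actually allows, and note the NodeManagers typically need a restart to pick up the change):

<property>
  <name>yarn.nodemanager.resource.memory-mb</name>
  <!-- hypothetical value, roughly 24g per node -->
  <value>24576</value>
</property>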