Spark architecture revolves entirely around the concept of executors and cores. I would like to see, practically, how many executors and cores are running for my Spark application in a cluster.
I tried the snippet below in my application, but with no luck:
import org.apache.spark.{SparkConf, SparkContext}

val conf = new SparkConf().setAppName("ExecutorTestJob")
val sc = new SparkContext(conf)
// Both calls fail when the properties were never set explicitly:
conf.get("spark.executor.instances")
conf.get("spark.executor.cores")
Is there any way to get those values using the SparkContext object or the SparkConf object, etc.?
getExecutorStorageStatus and getExecutorMemoryStatus both return all executors, including the driver, as in the example snippet below.
/** Returns the currently active/registered executors, excluding the driver.
 * @param sc The Spark context to retrieve registered executors from.
 * @return a list of executors, each in the form host:port.
 */
def currentActiveExecutors(sc: SparkContext): Seq[String] = {
  val allExecutors = sc.getExecutorMemoryStatus.keys
  val driverHost: String = sc.getConf.get("spark.driver.host")
  allExecutors.filterNot(_.split(":")(0) == driverHost).toList
}
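For example, here is a small usage sketch, assuming the helper above is in scope and the executors have already registered with the driver:
val executors = currentActiveExecutors(sc)
println(s"Active executors (excluding driver): ${executors.size}")
executors.foreach(e => println(s"  $e"))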
The configured number of executor instances can also be read from the conf, with a fallback default:
sc.getConf.getInt("spark.executor.instances", 1)
Similarly, you can get all the properties and print them as below; there you will see the cores information as well:
sc.getConf.getAll.mkString("\n")
OR
sc.getConf.toDebugString
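As a quick sketch, you can also narrow the dump down to the relevant entries; the key-name filter on "executor"/"cores" is only an assumption about how the properties of interest are named:
sc.getConf.getAll
  .filter { case (k, _) => k.contains("executor") || k.contains("cores") }
  .foreach { case (k, v) => println(s"$k = $v") }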
Mostly, spark.executor.cores gives the cores per executor, and spark.driver.cores is the value the driver should have.
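Putting it together, a rough sketch: the fallback value 1 is an assumption used only when a property was never set, and this reflects the static configuration, not dynamic allocation.
val executorCount = sc.getConf.getInt("spark.executor.instances", 1)
val coresPerExecutor = sc.getConf.getInt("spark.executor.cores", 1)
val driverCores = sc.getConf.getInt("spark.driver.cores", 1)
println(s"executors=$executorCount, cores/executor=$coresPerExecutor, driver cores=$driverCores")
println(s"estimated total executor cores = ${executorCount * coresPerExecutor}")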
EDIT: These methods are not exposed directly in PySpark, but they can be accessed using the Py4J bindings exposed from SparkSession:
sc._jsc.sc().getExecutorMemoryStatus()