TaskSchedulerImpl: Initial job has not accepted any resources;

sandeep · Apr 6, 2015 · Viewed 18.7k times

Here is what I am trying to do.

I have created a two-node DataStax Enterprise cluster, on top of which I have written a Java program to get the row count of one table (a Cassandra database table).

The program was built in Eclipse on a Windows box.

When running the program from Windows, it fails at runtime with the following error:

Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient memory

The same code compiles and runs successfully on the cluster nodes themselves without any issue. What could be the reason I am getting the above error?

Code:

import org.apache.spark.SparkConf;
import org.apache.spark.SparkContext;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.sql.SchemaRDD;
import org.apache.spark.sql.cassandra.CassandraSQLContext;
import com.datastax.bdp.spark.DseSparkConfHelper;

public class SparkProject  {

    public static void main(String[] args) {

        SparkConf conf = DseSparkConfHelper.enrichSparkConf(new SparkConf())
                .setMaster("spark://10.63.24.14X:7077")
                .setAppName("DatastaxTests")
                .set("spark.cassandra.connection.host", "10.63.24.14x")
                .set("spark.executor.memory", "2048m")
                .set("spark.driver.memory", "1024m")
                .set("spark.local.ip", "10.63.24.14X");

        JavaSparkContext sc = new JavaSparkContext(conf);

        CassandraSQLContext cassandraContext = new CassandraSQLContext(sc.sc());
        SchemaRDD employees = cassandraContext.sql("SELECT * FROM portware_ants.orders");

        //employees.registerTempTable("employees");
        //SchemaRDD managers = cassandraContext.sql("SELECT symbol FROM employees");
        System.out.println(employees.count());

        sc.stop();
    }
}
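
For context, the same resource settings can also be supplied at submit time instead of being hard-coded in the program. A sketch of such an invocation; the jar name is a placeholder, and the masked `14X` address is kept from the question:

```shell
# Submitting with explicit resource flags (jar name is a placeholder).
# If --executor-memory asks for more than any worker has free, the job
# hangs with the "Initial job has not accepted any resources" message.
dse spark-submit \
  --class SparkProject \
  --master spark://10.63.24.14X:7077 \
  --executor-memory 2048m \
  --driver-memory 1024m \
  sparkproject.jar
```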

Answer

ameyamm · May 28, 2015

I faced a similar issue, and after some online research and trial and error, I narrowed it down to three causes (except for the first, the other two are not even close to what the error message suggests):

  1. As the error indicates, you may be allocating more resources than are available. => This was not my issue.
  2. Hostname and IP address mishaps: I took care of this by specifying SPARK_MASTER_IP and SPARK_LOCAL_IP in spark-env.sh.
  3. Disable the firewall on the client: this was the solution that worked for me. Since I was working on an in-house prototype, I disabled the firewall on the client node. For some reason, the worker nodes were not able to talk back to the client. For production purposes, you would want to open up only the specific ports required rather than disabling the firewall.
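
To illustrate points 2 and 3, here is a rough sketch of a spark-env.sh fragment and matching firewall rules. The addresses reuse the masked values from the question, and the port numbers are assumptions: by default the driver and block-manager ports are random, so to open specific ports you would first pin them via `spark.driver.port` and `spark.blockManager.port` in your SparkConf.

```shell
# conf/spark-env.sh on each node (addresses are examples, not real IPs)
export SPARK_MASTER_IP=10.63.24.14X   # address the master binds to
export SPARK_LOCAL_IP=10.63.24.14X    # address this node advertises to the cluster

# On the client (driver) machine, instead of disabling the firewall entirely,
# open the ports the workers need to reach back on. The 51000/51001 values
# are examples that assume you pinned spark.driver.port / spark.blockManager.port.
firewall-cmd --permanent --add-port=4040/tcp    # application web UI
firewall-cmd --permanent --add-port=51000/tcp   # pinned spark.driver.port
firewall-cmd --permanent --add-port=51001/tcp   # pinned spark.blockManager.port
firewall-cmd --reload
```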