Top "Spark-submit" questions

spark-submit is a script that can run Apache Spark code written in, e.g., Java, Scala, or Python.

How to stop INFO messages displaying on spark console?

I'd like to stop the various messages that appear in the Spark shell. I tried to edit the log4j.properties …

apache-spark log4j spark-submit
Add jars to a Spark Job - spark-submit

True ... it has been discussed quite a lot. However, there is a lot of ambiguity and some of the answers …

java scala apache-spark jar spark-submit
Spark Driver Memory and Executor Memory

I am a beginner to Spark and I am running my application to read 14 KB of data from a text file, do some …

java apache-spark spark-streaming spark-submit
List of spark-submit options

There are a ton of tunable settings mentioned on the Spark configuration page. However, as explained here, the SparkSubmitOptionParser attribute name for …

apache-spark spark-submit
How to pass external parameters through Spark submit

In my application, I need to connect to the database, so I need to pass the IP address and database name …

java apache-spark spark-submit
How to spark-submit a python file in spark 2.1.0?

I am currently running Spark 2.1.0. I have worked most of the time in the PySpark shell, but I need to spark-submit …

apache-spark pyspark apache-spark-sql pyspark-sql spark-submit
How to execute spark submit on amazon EMR from Lambda function?

I want to execute a spark-submit job on an AWS EMR cluster based on a file upload event on S3. I …

amazon-web-services apache-spark aws-lambda amazon-emr spark-submit
Best practice to create SparkSession object in Scala to use both in unittest and spark-submit

I have tried to write a transform method from DataFrame to DataFrame. And I also want to test it by …

scala apache-spark spark-submit
Spark: How to set spark.yarn.executor.memoryOverhead property in spark-submit

In Spark 2.0, how do you set spark.yarn.executor.memoryOverhead when you run spark-submit? I know for things …

apache-spark pyspark spark-submit
Spark java.lang.OutOfMemoryError : Java Heap space

I am getting the above error when I run a model training pipeline with Spark `val inputData = spark.read .option("…

apache-spark out-of-memory spark-submit