I am trying to develop a Spark application in Eclipse and then debug it by stepping through it.
I downloaded the Spark source code and added some of the Spark subprojects (such as spark-core) to Eclipse. I have already installed Scala IDE in Eclipse. I created a simple application based on the example given on the Spark website:
    import org.apache.spark.SparkContext
    import org.apache.spark.SparkContext._
    import org.apache.spark.SparkConf

    object SimpleApp {
      def main(args: Array[String]) {
        val logFile = "YOUR_SPARK_HOME/README.md" // Should be some file on your system
        val conf = new SparkConf().setAppName("Simple Application")
        val sc = new SparkContext(conf)
        val logData = sc.textFile(logFile, 2).cache()
        val numAs = logData.filter(line => line.contains("a")).count()
        val numBs = logData.filter(line => line.contains("b")).count()
        println("Lines with a: %s, Lines with b: %s".format(numAs, numBs))
      }
    }
I added the spark-core project to my project as a dependent project (right click -> Build Path -> Add Project). Now I am trying to build and run my application. However, Eclipse marks my project as having errors, yet nothing is listed in the Problems view and no lines are highlighted in red, so I am not sure what the problem is. My assumption is that I need to add external JARs to my project, but I do not know which ones. The error is triggered by the line val conf = new SparkConf().setAppName("Simple Application") and the lines after it: when I remove those lines, the error goes away. I would appreciate any help and guidance, thanks!
It seems you are not using a build or dependency manager (e.g. sbt or Maven), which would take care of most versioning issues for you. Matching the correct versions of Java, Scala, Spark, and all of Spark's transitive dependencies by hand can be challenging. I strongly recommend converting your project to a Maven project: Convert Existing Eclipse Project to Maven Project
Personally, I have had very good experiences with sbt in IntelliJ IDEA (https://confluence.jetbrains.com/display/IntelliJIDEA/Getting+Started+with+SBT), which is easy to set up and maintain.
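To make the suggestion concrete, here is a minimal sketch of what a build.sbt for the SimpleApp from the question could look like. The project name and the Scala and Spark version numbers are assumptions chosen for illustration, not taken from the question; pick the versions that match your own setup.

```scala
// build.sbt: minimal sketch, all names and versions below are assumptions
name := "simple-app" // hypothetical project name

version := "0.1.0"

// Must be a Scala version that your chosen Spark release is published for
scalaVersion := "2.12.18"

// %% appends the Scala binary version (here _2.12) to the artifact name,
// so this resolves spark-core_2.12 together with its transitive dependencies
libraryDependencies += "org.apache.spark" %% "spark-core" % "3.5.1"
```

With a file like this in the project root and SimpleApp.scala under src/main/scala, sbt downloads spark-core and everything it depends on, so you no longer have to attach the Spark subprojects or individual JARs to the build path by hand.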