I'm trying to install Spark on my Mac. I've used Homebrew to install Spark 2.4.0 and Scala. I've installed PySpark in my Anaconda environment and am using PyCharm for development. I've added the following exports to my bash profile:
export SPARK_VERSION=`ls /usr/local/Cellar/apache-spark/ | sort | tail -1`
export SPARK_HOME="/usr/local/Cellar/apache-spark/$SPARK_VERSION/libexec"
export PYTHONPATH=$SPARK_HOME/python/:$PYTHONPATH
export PYTHONPATH=$SPARK_HOME/python/lib/py4j-0.9-src.zip:$PYTHONPATH
However, I'm unable to get it to work.
From reading the traceback, I suspect this is due to the Java version. I would really appreciate some help fixing the issue. Please comment if there is any helpful information I could provide beyond the traceback.
I am getting the following error:
Traceback (most recent call last):
  File "<input>", line 4, in <module>
  File "/anaconda3/envs/coda/lib/python3.6/site-packages/pyspark/rdd.py", line 816, in collect
    sock_info = self.ctx._jvm.PythonRDD.collectAndServe(self._jrdd.rdd())
  File "/anaconda3/envs/coda/lib/python3.6/site-packages/py4j/java_gateway.py", line 1257, in __call__
    answer, self.gateway_client, self.target_id, self.name)
  File "/anaconda3/envs/coda/lib/python3.6/site-packages/py4j/protocol.py", line 328, in get_return_value
    format(target_id, ".", name), value)
py4j.protocol.Py4JJavaError: An error occurred while calling z:org.apache.spark.api.python.PythonRDD.collectAndServe.
: java.lang.IllegalArgumentException: Unsupported class file major version 55
Edit: Spark 3.0 supports Java 11, so you'll need to upgrade:
Spark runs on Java 8/11, Scala 2.12, Python 2.7+/3.4+ and R 3.1+. Java 8 prior to version 8u92 support is deprecated as of Spark 3.0.0
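If you'd rather go that route, here is a rough sketch for macOS, assuming you installed Spark via Homebrew's apache-spark formula and already have a JDK 11 on the machine (versions will vary):
brew upgrade apache-spark                          # move from 2.4.x to a 3.x release
export JAVA_HOME=$(/usr/libexec/java_home -v 11)   # point at an installed JDK 11
$SPARK_HOME/bin/spark-submit --version             # should now report Spark 3.x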
Original answer
Until Spark supports Java 11 or higher (which would hopefully be mentioned in the latest documentation when it does), you have to explicitly set your Java version to Java 8.
As of Spark 2.4.x
Spark runs on Java 8, Python 2.7+/3.4+ and R 3.1+. For the Scala API, Spark 2.4.4 uses Scala 2.12. You will need to use a compatible Scala version (2.12.x)
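Before changing anything, it's worth confirming which Java your shell is actually picking up; "major version 55" in the traceback corresponds to Java 11 class files. On macOS, these standard commands will show it:
java -version                  # the Java currently on your PATH
echo $JAVA_HOME                # what Spark will use if it's set
/usr/libexec/java_home -V      # every JDK installed on the machine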
On Mac/Unix, see asdf-java for installing different Java versions.
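As a sketch of how that might look with asdf (the plugin lives at halcyon/asdf-java; the exact Java 8 build string below is only an example, so use list-all to see what's actually available):
asdf plugin-add java https://github.com/halcyon/asdf-java.git
asdf list-all java | grep 8.0               # pick a Java 8 build from the list
asdf install java adoptopenjdk-8.0.292+10   # example build string only
asdf global java adoptopenjdk-8.0.292+10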
On a Mac, I am able to do this in my .bashrc:
export JAVA_HOME=$(/usr/libexec/java_home -v 1.8)
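After adding that line, something like the following should confirm the switch (in a new terminal or after re-sourcing the file):
source ~/.bashrc                  # or open a new terminal
echo $JAVA_HOME                   # should now point at a 1.8 JDK
$JAVA_HOME/bin/java -version      # should report java version "1.8.0_..."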
On Windows, check out Chocolatey, but seriously, just use WSL2 or Docker to run Spark.
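For example, the jupyter/pyspark-notebook image is one commonly used option that bundles Spark with a matching JDK, so the host's Java version stops mattering (the image name here is just an illustration; any Spark image with a bundled JDK works):
docker run -it --rm -p 8888:8888 jupyter/pyspark-notebook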
You can also set this in spark-env.sh rather than setting the variable for your whole profile.
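For example, something like this in $SPARK_HOME/conf/spark-env.sh (copy spark-env.sh.template if the file doesn't exist yet) keeps the setting scoped to Spark:
# $SPARK_HOME/conf/spark-env.sh
export JAVA_HOME=$(/usr/libexec/java_home -v 1.8)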
And, of course, this all means you'll need to install Java 8 in addition to your existing Java 11.
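On macOS with Homebrew, one way to get a Java 8 build alongside 11 is via a cask; cask names for JDKs change over time, so check brew search jdk first:
brew install --cask temurin@8      # or adoptopenjdk8 on older Homebrew setups
/usr/libexec/java_home -V          # should now list both the 8 and 11 JDKs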