PySpark "does not exist in the JVM" error when initializing SparkContext

thebeancounter · Nov 5, 2018 · Viewed 20.3k times

I am using Spark on EMR and writing a PySpark script. I get an error when trying to run:

from pyspark import SparkContext
sc = SparkContext()

This is the error:

File "pyex.py", line 5, in <module>
    sc = SparkContext()   File "/usr/local/lib/python3.4/site-packages/pyspark/context.py", line 118, in __init__
    conf, jsc, profiler_cls)   File "/usr/local/lib/python3.4/site-packages/pyspark/context.py", line 195, in _do_init
    self._encryption_enabled = self._jvm.PythonUtils.getEncryptionEnabled(self._jsc)   File "/usr/local/lib/python3.4/site-packages/py4j/java_gateway.py", line 1487, in __getattr__
    "{0}.{1} does not exist in the JVM".format(self._fqn, name)) py4j.protocol.Py4JError: org.apache.spark.api.python.PythonUtils.getEncryptionEnabled does not exist in the JVM

I found this answer stating that I need to import SparkContext, but that did not work either.

Answer

svw · Nov 7, 2018

PySpark 2.4.0 was recently released, but there is no stable Apache Spark release that coincides with this new version. Try downgrading to PySpark 2.3.2; this fixed it for me.
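For example, assuming PySpark was installed with pip (as the site-packages paths in your traceback suggest), the downgrade is just a reinstall of the pinned version; adjust the version number to match the Spark on your cluster:

pip uninstall pyspark
pip install pyspark==2.3.2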

Edit: to be clearer, your PySpark version needs to match the version of Apache Spark that is installed, or you may run into compatibility issues.

Check the version of PySpark by running:

pip freeze
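To compare it against the Spark installation itself, something like the following works (assuming spark-submit is on your PATH, as it is on EMR):

pip freeze | grep pyspark   # version of the Python package
spark-submit --version      # version of the Spark installation

The two should report the same version, e.g. both 2.3.2.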