Installing Scala kernel (or Spark/Toree) for Jupyter (Anaconda)

robromo · Sep 16, 2016

I'm running RHEL 6.7 and have Anaconda set up (anaconda 4.10). Jupyter works out of the box with the default Python kernel, so I can select "python notebook" in Jupyter.

I'm now looking to get Scala set up with Jupyter as well. (It seems the Spark kernel, now called Toree, will work?)

None of the questions/answers I've seen about this address the issue I'm running into.

I tried to install Toree with

sudo pip install toree 

and it worked. But the next step is to run

jupyter toree install

And the error I get is:

jupyter toree install
Traceback (most recent call last):
  File "/usr/app/anaconda/bin/jupyter-toree", line 7, in <module>
    from toree.toreeapp import main
ImportError: No module named toree.toreeapp

Am I missing a step? Am I doing anything wrong? If I need to provide more information, I will be glad to. Thanks!

Edit (tl;dr): What is the standard, easiest, most reliable way to get a Scala notebook in Jupyter?

Answer

user6273920 · Sep 22, 2016

If you are trying to use Spark 2.0 with Scala 2.11, you may get strange messages. You need to update to the latest Toree, 0.2.0. For Ubuntu 16.04 64-bit, I have a package and a tgz file at https://anaconda.org/hyoon/toree

That's for Python 2.7, and you will need conda. If you don't know how to use it, just download the tgz and then run

tar zxvf toree-0.2.0.dev1.tar.gz
pip install -e toree-0.2.0.dev1

Then I prefer to install the kernels with:

jupyter toree install --spark_home=/opt/spark --user --kernel_name=apache_toree --interpreters=PySpark,SparkR,Scala,SQL

This creates the kernels in ~/.local/share/jupyter/kernels (--user is the key).
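To check that the install step really wrote kernels into that directory, one option is to list every subdirectory that contains a kernel.json. The following is only a minimal sketch: the function name list_user_kernels is made up for illustration, and the default path is simply the per-user location mentioned above.

```shell
# Hypothetical helper: print the name of every kernelspec directory
# (one that contains a kernel.json) under the given kernels directory.
# Defaults to the per-user location used by `--user` installs on Linux.
list_user_kernels() {
    kernels_dir="${1:-$HOME/.local/share/jupyter/kernels}"
    for spec in "$kernels_dir"/*/kernel.json; do
        # Skip the unexpanded glob when no kernels are installed
        [ -f "$spec" ] || continue
        basename "$(dirname "$spec")"
    done
}

list_user_kernels
```

Running `jupyter kernelspec list` should report the same kernels, along with any installed in system-wide locations.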

Happy sparking!