What is the default password for Jupyter created on Google's Dataproc?

Watt · Dec 13, 2016

I set up Dataproc using the steps in the tutorial linked here:

https://cloud.google.com/dataproc/docs/tutorials/jupyter-notebook

But my Jupyter keeps asking for a password.


I didn't set any password.

I tried my Google account password, but that doesn't work.

I ran ../root$ sudo grep -ir password and got the following, which seems to confirm that no password is set:

.jupyter/jupyter_notebook_config.py:## Hashed password to use for web authentication.
.jupyter/jupyter_notebook_config.py:#  The string should be of the form type:salt:hashed-password.
.jupyter/jupyter_notebook_config.py:#c.NotebookApp.password = u''
.jupyter/jupyter_notebook_config.py:#  Only used when no password is enabled.
.local/share/jupyter/runtime/nbserver-3668.json:  "password": false, 

Answer

Dennis Huo · Dec 13, 2016

Since the initialization action just installs the latest version using conda install jupyter, this appears to have been caused by a recent upstream change: the notebook component was upgraded from 4.2.3 to 4.3.0, which turns on token-based auth by default. A cluster I deployed a couple of weeks ago using the out-of-the-box init action didn't show the login page you're seeing. The design of the init action is to let Google Compute Engine firewalls be your layer of defense and the SSH tunnel be your secure connection, rather than relying on various third-party implementations of auth from the different Hadoop/Spark tools and web UIs.

The solution will be to add a line to setup-jupyter-kernel.sh:

echo "c.NotebookApp.token = u''" >> ~/.jupyter/jupyter_notebook_config.py

to disable Jupyter-side authentication altogether and revert to the behavior of a couple of weeks ago. Note that if you want to do this yourself, you'll have to adjust the INIT_ACTIONS_REPO and INIT_ACTIONS_BRANCH settings in jupyter.sh, which may take some getting used to if you haven't been customizing it already. We'll try to push a fix as soon as possible; once that's done, you should be able to use the out-of-the-box init action without getting the login screen again.

If you already have a cluster running, you can disable the auth for your Jupyter server by running the following manually as root after SSH'ing into the master:

sudo su
killall -9 jupyter-notebook
echo "c.NotebookApp.token = u''" >> ~/.jupyter/jupyter_notebook_config.py
/dataproc-initialization-actions/jupyter/internal/launch-jupyter-kernel.sh
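
One side effect of the echo ... >> approach is that every run appends another copy of the line. That is harmless to Jupyter but clutters the config file. As a minimal sketch only (the helper name disable_token_auth is hypothetical and is not part of the init action), the same edit can be made idempotent in Python:

```python
import os

# The exact line the answer appends to jupyter_notebook_config.py.
TOKEN_LINE = "c.NotebookApp.token = u''"


def disable_token_auth(config_path):
    """Append TOKEN_LINE to config_path unless it is already present.

    Hypothetical helper for illustration; returns True if the file was
    modified and False if the line was already there.
    """
    existing = ""
    if os.path.exists(config_path):
        with open(config_path) as f:
            existing = f.read()
    # Compare whole stripped lines so commented-out variants do not count.
    if any(line.strip() == TOKEN_LINE for line in existing.splitlines()):
        return False
    with open(config_path, "a") as f:
        # Start on a fresh line if the file does not end with a newline.
        if existing and not existing.endswith("\n"):
            f.write("\n")
        f.write(TOKEN_LINE + "\n")
    return True
```

Calling disable_token_auth(os.path.expanduser("~/.jupyter/jupyter_notebook_config.py")) as root would stand in for the echo line; the notebook server still has to be restarted afterwards, as in the commands above.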

Alternatively, if you do want to keep the new default token-based authentication, the Jupyter server logs a generated token to /var/log/jupyter_notebook.log. Look for a line stating The Jupyter Notebook is running at: http://[all ip addresses on your system]:8123/?token=[some-token-string-here]. That token string can be entered in the password field or passed as the URL parameter, as the log line shows.
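
To pull the token out of that log without reading through it by hand, something like the following Python sketch may help. It assumes the token appears as a token=... query parameter made of letters and digits, as in the log line quoted above. The function names are hypothetical, and reading the log may require root.

```python
import re

# Matches the token query parameter in the URL that Jupyter logs at startup.
# Assumption: the token consists only of ASCII letters and digits.
TOKEN_RE = re.compile(r"[?&]token=([0-9A-Za-z]+)")


def extract_token(log_text):
    """Return the most recently logged token in log_text, or None."""
    matches = TOKEN_RE.findall(log_text)
    # The server logs a new token on each start, so the last match is
    # the one belonging to the most recent launch.
    return matches[-1] if matches else None


def token_from_log(path="/var/log/jupyter_notebook.log"):
    """Read the log file at path (location taken from the answer) and extract the token."""
    with open(path) as f:
        return extract_token(f.read())
```

Running token_from_log() on the master node should return the string to paste into the password field, or None if no token line has been logged.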

EDIT: The fix has now been committed into Dataproc's init action repository and synced to gs://dataproc-initialization-actions. Deployments out-of-the-box once again work without an extra login page in the Jupyter UI.

A new metadata option with key JUPYTER_AUTH_TOKEN has also been added in case you do want to specify a token, which Jupyter also accepts in the password field. Use it as follows only if you want a login page requesting your specified token (no metadata keys are necessary if you just want the old behavior of no login page):

gcloud dataproc clusters create <cluster-name> \
    --initialization-actions gs://dataproc-initialization-actions/jupyter/jupyter.sh \
    --metadata JUPYTER_AUTH_TOKEN=foobarbaz

Then your login password will be foobarbaz.