Using Kaggle Datasets in Google Colab

hdiz · Mar 15, 2018 · Viewed 31.4k times

Is it possible to use any datasets available via the Kaggle API in Google Colab? I see the Kaggle API is used in this Colab notebook, but it's a bit unclear to me which datasets it provides access to.

Answer

Bob Smith · Jun 1, 2018

Step-by-step --

  1. Create an API key in Kaggle.

    To do this, go to kaggle.com/ and open your user settings page.

  2. Next, scroll down to the API access section and click generate to download an API key. This will download a file called kaggle.json to your computer. You'll use this file in Colab to access Kaggle datasets and competitions.

  3. Navigate to https://colab.research.google.com/.

  4. Upload your kaggle.json file using the following snippet in a code cell:

    from google.colab import files
    files.upload()

  5. Install the Kaggle API client using !pip install -q kaggle.

  6. Move the kaggle.json file into ~/.kaggle, which is where the API client expects your token to be located:

    !mkdir -p ~/.kaggle
    !cp kaggle.json ~/.kaggle/

  7. Now you can access datasets using the client, e.g., !kaggle datasets list.
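
Step 6 can also be done from a Python cell instead of shell commands. The following is a minimal sketch, not part of the original answer: the helper name install_kaggle_token is hypothetical, and it assumes kaggle.json has already been uploaded to the current working directory and contains the username and key fields that a Kaggle token file normally has. It also restricts the file to owner-only permissions, since the client prints a warning when the token is readable by other users.

```python
import json
import os
import shutil
from pathlib import Path


def install_kaggle_token(src="kaggle.json", config_dir=None):
    """Copy a Kaggle API token into the client's config directory.

    Hypothetical helper for illustration; not part of the Kaggle client.
    """
    src = Path(src)
    # Default to ~/.kaggle, where the client expects the token to be located.
    config_dir = Path(config_dir) if config_dir else Path.home() / ".kaggle"

    # Fail early if the file does not look like a token
    # (assumed fields: "username" and "key").
    with src.open() as f:
        token = json.load(f)
    missing = {"username", "key"} - set(token)
    if missing:
        raise ValueError(f"{src} is missing fields: {sorted(missing)}")

    config_dir.mkdir(parents=True, exist_ok=True)  # like: mkdir -p ~/.kaggle
    dest = config_dir / "kaggle.json"
    shutil.copy(src, dest)                         # like: cp kaggle.json ~/.kaggle/
    os.chmod(dest, 0o600)                          # owner read/write only
    return dest
```

In Colab, calling install_kaggle_token() after files.upload() would stand in for the two shell commands in step 6.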

Here's a complete example notebook of the Colab portion of this process: https://colab.research.google.com/drive/1DofKEdQYaXmDWBzuResXWWvxhLgDeVyl

This example shows uploading the kaggle.json file, installing the Kaggle API client, and using the client to download a dataset.
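
For reference, the download step might look like the sketch below. It calls the kaggle command-line client from Python rather than from a ! cell. The helper names and the dataset slug owner/dataset-name are placeholders chosen for illustration, not values taken from the linked notebook.

```python
import subprocess


def build_download_command(dataset, dest="data", unzip=True):
    """Return the argument list for `kaggle datasets download`.

    `dataset` is an "owner/dataset-name" slug, as seen in a dataset's URL.
    """
    if dataset.count("/") != 1 or not all(dataset.split("/")):
        raise ValueError("expected a slug of the form 'owner/dataset-name'")
    cmd = ["kaggle", "datasets", "download", "-d", dataset, "-p", dest]
    if unzip:
        cmd.append("--unzip")  # extract the archive after downloading
    return cmd


def download_dataset(dataset, dest="data", unzip=True):
    """Run the download; needs the kaggle package and a token in ~/.kaggle."""
    subprocess.run(build_download_command(dataset, dest, unzip), check=True)


# Placeholder slug; substitute a real one before calling download_dataset().
print(" ".join(build_download_command("owner/dataset-name")))
```

Building the argument list separately from running it makes it easy to check the command before anything is downloaded.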