How to read csv to dataframe in Google Colab

PagMax · Jan 19, 2018 · Viewed 75.4k times

I am trying to read a CSV file stored locally on my machine. (For reference, it is the Titanic data from Kaggle, which is here.)

From this question and its answers I learnt that you can upload data using this code, which works well for me.

from google.colab import files
uploaded = files.upload()

Where I am lost is how to convert it to a DataFrame from here. The sample Google notebook page linked in the answer above does not cover it.

I am trying to convert the uploaded dictionary to a DataFrame using the from_dict command, but I have not been able to make it work. There is some discussion on converting a dict to a DataFrame here, but I don't think the solutions apply to my case.

So, to summarize, my question is:

How do I convert a CSV file stored locally on my machine to a pandas DataFrame in Google Colaboratory?

Answer

Bob Smith · Jan 19, 2018

Pandas read_csv should do the trick. The values in uploaded are raw bytes, so you'll want to decode them and wrap the resulting string in an io.StringIO, since read_csv expects a path or a file-like object.

Here's a full example: https://colab.research.google.com/notebook#fileId=1JmwtF5OmSghC-y3-BkvxLan0zYXqCJJf

The key snippet is:

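import io

import pandas as pd

# uploaded maps each file name to its contents as bytes:
# decode to text, then wrap in a file-like object for read_csv
df = pd.read_csv(io.StringIO(uploaded['train.csv'].decode('utf-8')))
df  # as the last line of a cell, this displays the DataFrame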
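
As a variation, here is a minimal sketch that skips the manual decode by wrapping the bytes in io.BytesIO and parses every uploaded file in one pass. So that it runs outside Colab, the uploaded dict below is a hypothetical stand-in for what files.upload() returns (file names mapped to bytes), with made-up sample rows. The helper name uploads_to_dataframes is also an assumption for illustration, not part of any library.

```python
import io

import pandas as pd

# Hypothetical stand-in for the dict returned by files.upload():
# keys are file names, values are the raw file contents as bytes.
# The rows are made-up sample data, not the real Titanic file.
uploaded = {
    "train.csv": b"PassengerId,Survived,Name\n1,0,Braund\n2,1,Cumings\n",
}


def uploads_to_dataframes(uploaded):
    """Parse every uploaded CSV into a DataFrame, keyed by file name."""
    # io.BytesIO wraps the bytes directly, so pandas does the decoding
    # (UTF-8 by default; pass encoding=... to read_csv for other encodings).
    return {
        name: pd.read_csv(io.BytesIO(data))
        for name, data in uploaded.items()
    }


frames = uploads_to_dataframes(uploaded)
df = frames["train.csv"]
print(df.head())
```

In Colab, you would drop the stand-in dict and pass the real result of files.upload() instead. If the key lookup fails, check list(uploaded.keys()) for the exact uploaded file name.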