How to read a pandas Series from a CSV file

gaborous · Apr 2, 2013 · Viewed 58.6k times

I have a CSV file formatted as follows:

somefeature,anotherfeature,f3,f4,f5,f6,f7,lastfeature
0,0,0,1,1,2,4,5

I am trying to read it as a pandas Series (using the pandas daily snapshot for Python 2.7). I tried the following:

import pandas as pd
types = pd.Series.from_csv('csvfile.txt', index_col=False, header=0)

and:

types = pd.read_csv('csvfile.txt', index_col=False, header=0, squeeze=True)

But neither works: the first gives a seemingly random result, and the second returns a DataFrame without squeezing it.

It seems pandas only recognizes a CSV as a Series when it is formatted as follows:

f1, value
f2, value2
f3, value3

But when the feature keys are in the first row instead of the first column, pandas does not squeeze the result.

Is there something else I can try? Is this behaviour intended?

Answer

gaborous · Apr 2, 2013

Here is the way I've found:

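import pandas as pd
df = pd.read_csv('csvfile.txt', index_col=False, header=0)
serie = df.iloc[0, :]  # first data row as a Series (.iloc replaces the .ix indexer, which later pandas releases removed)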

This seems a bit silly to me, as squeeze should already do this. Is this a bug, or am I missing something?

EDIT: The best way I have found to do it:

df = pd.read_csv('csvfile.txt', index_col=False, header=0)
serie = df.transpose()[0]  # column 0 of the transposed frame is the first data row, as a Series

Of the approaches I tried, this is the most reliable way to turn a single CSV data row into a pandas Series.
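
As a self-contained illustration of the row-to-Series step, here is a minimal sketch. It assumes a reasonably recent pandas release and reads the sample from the question out of an in-memory string instead of csvfile.txt; the names csv_text, row and squeezed are only illustrative.

```python
import io

import pandas as pd

# Same layout as the file in the question: one header row, one data row.
csv_text = (
    "somefeature,anotherfeature,f3,f4,f5,f6,f7,lastfeature\n"
    "0,0,0,1,1,2,4,5\n"
)

df = pd.read_csv(io.StringIO(csv_text), header=0)

# Select the single data row by position; the column names become the index.
row = df.iloc[0]

# For a one-row frame, squeezing the row axis should give the same Series.
squeezed = df.squeeze(axis="index")

print(row)
```

Selecting by position with iloc avoids depending on how the row happens to be labelled, which is why it is used here rather than a label lookup.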

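By the way, the squeeze=True argument does not help here: as of today (April 2013) it only returns a Series when the parsed data contains a single column, so a file with one data row and several columns still comes back as a DataFrame. See the official docs: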

http://pandas.pydata.org/pandas-docs/dev/io.html#returning-series
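
For comparison, the key/value-per-line layout from the question is the case where squeezing does apply. The sketch below assumes a recent pandas release, where (as far as I know) the squeeze keyword of read_csv is no longer available and DataFrame.squeeze is called on the result instead. The sample values are made up for illustration.

```python
import io

import pandas as pd

# One "key,value" pair per line, no header row (hypothetical sample data).
csv_text = "f1,10\nf2,20\nf3,30\n"

# Use the first column as the index, leaving a single data column.
df = pd.read_csv(io.StringIO(csv_text), header=None, index_col=0)

# With exactly one column left, squeezing the column axis yields a Series.
values = df.squeeze(axis="columns")

print(values)
```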