I have a CSV file formatted as follows:
somefeature,anotherfeature,f3,f4,f5,f6,f7,lastfeature
0,0,0,1,1,2,4,5
I am trying to read it as a pandas Series (using the pandas daily snapshot for Python 2.7). I tried the following:
import pandas as pd
types = pd.Series.from_csv('csvfile.txt', index_col=False, header=0)
and:
types = pd.read_csv('csvfile.txt', index_col=False, header=0, squeeze=True)
But neither works: the first gives an unexpected result, and the second just returns a DataFrame without squeezing it.
It seems that pandas only recognizes a CSV as a Series when it is formatted as follows:
f1, value
f2, value2
f3, value3
But when the feature keys are in the first row instead of the first column, pandas does not squeeze it.
Is there something else I can try? Is this behaviour intended?
Here is the workaround I've found:
df = pd.read_csv('csvfile.txt', index_col=False, header=0)
serie = df.ix[0,:]
This seems a bit silly to me, since squeeze should already do this. Is this a bug, or am I missing something?
/EDIT: Best way to do it:
df = pd.read_csv('csvfile.txt', index_col=False, header=0)
serie = df.transpose()[0]  # here we convert the DataFrame into a Series
This is the most reliable way I have found to get a single-row CSV into a pandas Series.
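For anyone who wants to try this without a file on disk, here is a minimal self-contained sketch. It assumes Python 3 and a reasonably recent pandas, and feeds the sample data through io.StringIO instead of 'csvfile.txt'; the names csv_text, row and same are only for illustration. It also shows that taking the first row by position with iloc appears to give the same Series as the transpose approach.

```python
import io

import pandas as pd

# Same layout as the file in the question: one header row, one data row.
csv_text = (
    "somefeature,anotherfeature,f3,f4,f5,f6,f7,lastfeature\n"
    "0,0,0,1,1,2,4,5\n"
)

# Read the data as a one-row DataFrame.
df = pd.read_csv(io.StringIO(csv_text), header=0)

# Option 1: transpose, then select the column labelled 0 (the old row label).
same = df.transpose()[0]

# Option 2: select the first row by position.
row = df.iloc[0]

print(row)
```

Both row and same are Series indexed by the feature names, so row['lastfeature'] looks up a single value.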
BTW, the squeeze=True argument does not help here, because as of today (April 2013) it only squeezes when the parsed data contains a single column (one key/value pair per line), not a single row; see the official doc:
http://pandas.pydata.org/pandas-docs/dev/io.html#returning-series
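To illustrate the single-column versus single-row difference, here is a small sketch using the DataFrame.squeeze method rather than the read_csv keyword. This is a substitution on my part: as far as I know, recent pandas releases no longer accept squeeze in read_csv, so the method is the closest equivalent. The sample data and variable names are made up for the example, and it assumes Python 3.

```python
import io

import pandas as pd

# Key/value layout: after using the first field as the index, one column is left.
one_column = pd.read_csv(
    io.StringIO("f1,10\nf2,20\nf3,30\n"), header=None, index_col=0
)
s_col = one_column.squeeze("columns")  # single column -> Series

# Header-row layout: one data row, several columns.
one_row = pd.read_csv(io.StringIO("a,b,c\n1,2,3\n"), header=0)
still_df = one_row.squeeze("columns")  # three columns, so nothing to squeeze
s_row = one_row.squeeze("index")       # single row -> Series

print(type(s_col).__name__, type(still_df).__name__, type(s_row).__name__)
```

Squeezing along the columns leaves the one-row frame untouched, which matches what I saw with squeeze=True; squeezing along the index is what actually turns the single row into a Series.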