TypeError: unhashable type

qurious · Dec 11, 2015 · Viewed 14.2k times

I wrote a small piece of code to do linear regression using sklearn.

I created a two-column CSV file (columns named X and Y, holding some numbers). When I read the file, the content is read properly, as shown below.

However, I get an "unhashable type" error when I try to refer to a column using expressions such as datafile[:,:] or datafile[:,-1].

And when I try to use X as the predictor and Y as the response in sklearn's linear regression, I get a ValueError, as shown below.

I looked online but could not figure out what is wrong with my code or file. Please help.

import pandas as pd
datafile=pd.read_csv('samplelinear.csv')
datafile

     X    Y    
0    0 1.440000 
1    1 33.220000 
. . . 

print(datafile.__class__)
<class 'pandas.core.frame.DataFrame'>

datafile[:,:]
TypeError: unhashable type

datafile[:,:1]
TypeError: unhashable type


from sklearn.linear_model import LinearRegression
model=LinearRegression()

model.fit(datafile.X,datafile.Y)
ValueError: Found arrays with inconsistent numbers of samples: [ 1 14]

Answer

maxymoo · Dec 11, 2015

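Plain square brackets on a DataFrame look up column labels, and a tuple of slices is not a valid label, which is why datafile[:,:] fails. If you want to use slice syntax to select from a DataFrame by position, you have to go through .iloc:

datafile.iloc[:,:1]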
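As a minimal sketch of the difference, the snippet below builds a small DataFrame in memory instead of reading samplelinear.csv (the values are made up for illustration) and compares positional selection through .iloc with plain bracket indexing. The exact exception raised by the plain form depends on the pandas and Python versions, so the sketch only checks that it fails.

```python
import pandas as pd

# Hypothetical stand-in for samplelinear.csv; the numbers are invented.
datafile = pd.DataFrame({"X": [0, 1, 2, 3], "Y": [1.44, 33.22, 61.50, 95.10]})

# Positional slicing goes through .iloc.
first_col = datafile.iloc[:, :1]   # all rows, first column, as a DataFrame
last_col = datafile.iloc[:, -1]    # all rows, last column, as a Series

# Label-based alternatives that do not rely on column positions.
x_frame = datafile[["X"]]          # 2-D DataFrame with one column
y_series = datafile["Y"]           # 1-D Series

# Plain brackets treat the key as a column label, so a tuple of slices fails.
try:
    datafile[:, :1]
    plain_indexing_failed = False
except Exception:
    plain_indexing_failed = True

print(first_col.shape, last_col.shape, plain_indexing_failed)
```

Note that a slice such as :1 keeps the result two-dimensional, while a single integer such as -1 collapses it to a Series.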

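For your second problem, the X input needs to be a 2-D array of shape (n_samples, n_features), not a 1-D vector. datafile.X is a Series of 14 values, which sklearn here apparently interpreted as 1 sample instead of 14, hence the mismatch with Y. Either include more columns or wrap the single column in a DataFrame: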

model.fit(pd.DataFrame(datafile.X), datafile.Y)
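A self-contained sketch of the same fix, assuming a small made-up dataset in place of samplelinear.csv (the points lie exactly on y = 2x + 1 so the fitted values are easy to check). It uses double brackets to keep X two-dimensional, which is an alternative to pd.DataFrame(datafile.X), and shows a reshape-based variant as well.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression

# Hypothetical data standing in for samplelinear.csv: y = 2x + 1 exactly.
datafile = pd.DataFrame({"X": [0, 1, 2, 3, 4], "Y": [1.0, 3.0, 5.0, 7.0, 9.0]})

# Double brackets select a one-column DataFrame, i.e. shape (n_samples, 1).
X = datafile[["X"]]
y = datafile["Y"]

model = LinearRegression()
model.fit(X, y)

# Equivalent 2-D input built from the underlying NumPy array.
X_alt = datafile["X"].to_numpy().reshape(-1, 1)
model_alt = LinearRegression().fit(X_alt, y)

print(X.shape, X_alt.shape)
print(model.coef_, model.intercept_)
```

Either form gives sklearn one row per sample and one column per feature, which is the layout fit expects for its first argument.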