Preprocessing in scikit-learn - single sample - Deprecation warning

Chris Arthur · Jan 29, 2016 · Viewed 94.5k times

On a fresh installation of Anaconda under Ubuntu... I am preprocessing my data in various ways prior to a classification task using Scikit-Learn.

from sklearn import preprocessing

scaler = preprocessing.MinMaxScaler().fit(train)
train = scaler.transform(train)    
test = scaler.transform(test)

This all works fine, but suppose I have a new sample (temp below) that I want to classify, and thus want to preprocess in the same way:

temp = [1,2,3,4,5,5,6,....................,7]
temp = scaler.transform(temp)

Then I get a deprecation warning...

DeprecationWarning: Passing 1d arrays as data is deprecated in 0.17 
and will raise ValueError in 0.19. Reshape your data either using 
X.reshape(-1, 1) if your data has a single feature or X.reshape(1, -1)
if it contains a single sample. 

So the question is how should I be rescaling a single sample like this?

I suppose an alternative (not a very good one) would be...

temp = [temp, temp]
temp = scaler.transform(temp)
temp = temp[0]

But I'm sure there are better ways.

Answer

Mike · Jul 12, 2016

Just listen to what the warning is telling you:

Reshape your data either using X.reshape(-1, 1) if your data has a single feature/column or X.reshape(1, -1) if it contains a single sample.
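To make the difference between the two reshapes concrete, here is a small NumPy sketch. The five-element array is made-up illustrative data, not the data from the question.

```python
import numpy as np

# Hypothetical 1-D data: five numbers with no row/column structure yet.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])

# One row, five columns: 1 sample with 5 features.
as_single_sample = x.reshape(1, -1)

# Five rows, one column: 5 samples with 1 feature each.
as_single_feature = x.reshape(-1, 1)

print(x.shape)                  # (5,)
print(as_single_sample.shape)   # (1, 5)
print(as_single_feature.shape)  # (5, 1)
```

The -1 tells NumPy to infer that dimension from the number of elements, so the same call works whatever the length of the input.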

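For your example (a single sample with more than one feature/column), convert the list to a NumPy array first, since a plain Python list has no reshape method:

import numpy as np
temp = np.array(temp).reshape(1, -1)

For one feature/column:

temp = np.array(temp).reshape(-1, 1)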
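Putting it together, a minimal end-to-end sketch might look like the following. The training matrix and the new sample are invented values chosen only for illustration, and the final ravel() step is an optional extra for code that expects a flat vector back.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Hypothetical training data: 3 samples, 4 features each.
train = np.array([[0.0, 10.0, 100.0, 5.0],
                  [5.0, 20.0, 200.0, 5.5],
                  [10.0, 30.0, 300.0, 6.0]])

# Fit the scaler on the training data only.
scaler = MinMaxScaler().fit(train)

# A new single sample given as a plain list, as in the question.
temp = [5.0, 15.0, 250.0, 5.25]

# Convert to an array and reshape to 2-D: 1 row (sample), 4 columns (features).
temp_2d = np.array(temp).reshape(1, -1)
scaled = scaler.transform(temp_2d)  # result has shape (1, 4)

# Flatten back to 1-D if later code expects a flat vector.
scaled_1d = scaled.ravel()
print(scaled_1d)  # approximately [0.5, 0.25, 0.75, 0.25]
```

This replaces the workaround of duplicating the sample and taking the first row of the result.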