Getting Error on StandardScaler fit_transform

Vikas Kyatannawar · Dec 6, 2017 · Viewed 12.6k times
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

dataset = pd.read_csv('Position_Salaries.csv')
X = dataset.iloc[:, 1:2].values
y = dataset.iloc[:, 2].values

from sklearn.preprocessing import StandardScaler
sc_X = StandardScaler()
sc_y = StandardScaler()
X = sc_X.fit_transform(X)
y = sc_y.fit_transform(y)

OK, so here is the problem. Both X and y hold a single feature and have one column. As you can see, X is a matrix and y is a vector:

X = dataset.iloc[:, 1:2].values
y = dataset.iloc[:, 2].values

Now when I run y = sc_y.fit_transform(y), I get an error saying that y is a 1D array. I could change it to y = dataset.iloc[:, 2:3].values to make it a 2D array, but y is the dependent variable and I want it to stay 1D. Also, I previously worked through other examples where I had to rescale similar data, and they did not give me this kind of error, so I am not sure why I am getting it now. Moreover, I am following a video while coding, and in the video everything is the same but the presenter doesn't get any error.

Answer

Vivek Kumar · Dec 6, 2017

StandardScaler is meant to work on the features, not on labels or target data, so it only accepts 2-D input. Please see the StandardScaler documentation for details.
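To make the shape requirement concrete, here is a minimal sketch that tries the same values as a 1-D and as a 2-D array. The salary figures and the helper name fits_ok are made up for illustration and are not from the question's CSV. As far as I know, scikit-learn 0.19 and later raise a ValueError for 1-D input, while earlier releases only emitted a deprecation warning, which may be why the video runs without an error.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Made-up target values standing in for the salary column (assumption).
y = np.array([45000.0, 50000.0, 60000.0, 80000.0])


def fits_ok(arr):
    """Return True if StandardScaler accepts the array, False on ValueError."""
    try:
        StandardScaler().fit_transform(arr)
        return True
    except ValueError:
        return False


one_d_ok = fits_ok(y)                 # shape (4,)
two_d_ok = fits_ok(y.reshape(-1, 1))  # shape (4, 1)

print(y.shape, "accepted:", one_d_ok)
print(y.reshape(-1, 1).shape, "accepted:", two_d_ok)
```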

What you can do instead is use the scale function. StandardScaler applies the same standardization through the estimator (fit/transform) API.

from sklearn.preprocessing import scale
y = scale(y)
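As a quick check of what scale does with a 1-D array, the sketch below compares its output with a manual standardization (subtract the mean, divide by the population standard deviation). The values are made up for illustration.

```python
import numpy as np
from sklearn.preprocessing import scale

# Made-up target values standing in for the salary column (assumption).
y = np.array([45000.0, 50000.0, 60000.0, 80000.0])

# scale accepts the 1-D array directly and returns a 1-D array.
y_scaled = scale(y)

# Manual standardization; np.std defaults to the population std (ddof=0).
y_manual = (y - y.mean()) / y.std()

print(y_scaled.shape)
print(np.allclose(y_scaled, y_manual))
```

Note that scale does not keep the fitted mean and standard deviation, so if you later need to map predictions back to the original units, the StandardScaler route with inverse_transform is probably more convenient.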

Or, if you want to use StandardScaler, you need to reshape your y to a 2-D array like this:

import numpy as np
y = np.array(y).reshape(-1,1)
y = sc_y.fit_transform(y)
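If y has to remain 1-D in the rest of the script, one possible approach is to reshape only for the scaler call and flatten the result with ravel. This is a sketch with made-up values, not the data from the question; inverse_transform is included to show how scaled values can be mapped back to the original units.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Made-up target values standing in for the salary column (assumption).
y = np.array([45000.0, 50000.0, 60000.0, 80000.0])

sc_y = StandardScaler()

# Reshape to (n_samples, 1) for the scaler, then flatten back to (n_samples,).
y_scaled = sc_y.fit_transform(y.reshape(-1, 1)).ravel()

# Recover the original units from the scaled values using the same scaler.
y_back = sc_y.inverse_transform(y_scaled.reshape(-1, 1)).ravel()

print(y_scaled.shape)
print(np.allclose(y_back, y))
```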