import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
dataset = pd.read_csv('Position_Salaries.csv')
X = dataset.iloc[:, 1:2].values
y = dataset.iloc[:, 2].values
from sklearn.preprocessing import StandardScaler
sc_X = StandardScaler()
sc_y = StandardScaler()
X = sc_X.fit_transform(X)
y = sc_y.fit_transform(y)
OK, so here is the problem. Both X and y have a single feature, i.e. one column. As you can see, X is a matrix and y is a vector:
X = dataset.iloc[:, 1:2].values
y = dataset.iloc[:, 2].values
Now, when I run
y = sc_y.fit_transform(y)
I get an error saying that y is a 1D array. If I change the assignment to
y = dataset.iloc[:, 2:3].values
that makes it a 2D array.
But y is the dependent variable, and I want it to stay a 1D array. I have also worked through other examples where I had to rescale similar data, and they did not give me this kind of error, so I am not sure why I am getting it now. Moreover, I am following a video while coding, and in the video everything is the same but he doesn't get any error.
StandardScaler is meant to work on features, not on labels or target data, so it only accepts 2D input. See the StandardScaler documentation for details.
What you can do instead is use the scale function. It accepts a 1D array and applies the same standardization; StandardScaler is the transformer-style counterpart of this function.
from sklearn.preprocessing import scale
y = scale(y)
Or, if you want to use StandardScaler, you need to reshape your y to a 2D array like this:
import numpy as np
y = np.array(y).reshape(-1,1)
y = sc_y.fit_transform(y)
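If you need y back as a 1D array after scaling, one option is to flatten the result again. The following is only a minimal sketch: the salary values are made up to stand in for the third column of Position_Salaries.csv, and the names y_scaled and y_restored are hypothetical.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical values standing in for the salary column of the CSV
y = np.array([45000.0, 50000.0, 60000.0, 80000.0, 110000.0])

sc_y = StandardScaler()

# reshape(-1, 1) gives the scaler the single-column 2D array it expects;
# ravel() flattens the scaled result back to a 1D vector
y_scaled = sc_y.fit_transform(y.reshape(-1, 1)).ravel()

# inverse_transform also expects 2D input, so reshape again before undoing the scaling
y_restored = sc_y.inverse_transform(y_scaled.reshape(-1, 1)).ravel()

print(y_scaled.shape)
```

Keeping the fitted sc_y around is the main reason to prefer this over scale: it lets you map predictions back to the original units later with inverse_transform.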