plot_decision_regions with error "Filler values must be provided when X has more than 2 training features."

Ramakrishna B · Oct 23, 2018 · Viewed 9.2k times

I am trying to draw a 2D decision-region plot for an SVC with a binary (Bernoulli) output.

The text was converted to vectors with average word2vec, then standardised and split into train and test sets. A grid search gave the best C and gamma for the RBF kernel.

import matplotlib.pyplot as plt
from sklearn.svm import SVC
from mlxtend.plotting import plot_decision_regions

# Best parameters from the grid search (SVC uses the RBF kernel by default)
clf = SVC(C=100, gamma=0.0001)
clf.fit(X_train1, y_train)

plot_decision_regions(X_train, y_train, clf=clf, legend=2)

plt.xlabel(X.columns[0], size=14)
plt.ylabel(X.columns[1], size=14)
plt.title('SVM Decision Region Boundary', size=16)

This raises the error: ValueError: y must be a NumPy array. Found

I also tried converting y to a NumPy array, which then raised: ValueError: y must be an integer array. Found object. Try passing the array as y.astype(np.integer)

Finally, I converted it to an integer array. Now it raises: ValueError: Filler values must be provided when X has more than 2 training features.
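For reference, the label conversion described above (to a NumPy array, then to an integer dtype) can be sketched roughly as follows. The label values here are hypothetical stand-ins, and np.unique is just one way to map object-dtype labels to integer codes:

```python
import numpy as np

# Hypothetical labels stored with object dtype (stand-in for the real y_train).
y_raw = np.array(["positive", "negative", "positive", "negative"], dtype=object)

# Map each distinct label to an integer code 0..k-1.
# `classes` keeps the original labels so the codes can be mapped back.
classes, y_int = np.unique(y_raw, return_inverse=True)
y_int = y_int.astype(int)

print(classes)      # the distinct labels, sorted
print(y_int.dtype)  # an integer dtype, as plot_decision_regions requires
```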

Answer

Vardan Agarwal · Sep 26, 2019

plot_decision_regions can only draw two features at a time, which is why it asks for filler values when X has more columns. You can use PCA to reduce your multi-dimensional data to two dimensions. Then pass the result to plot_decision_regions, and no filler values are needed.

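import matplotlib.pyplot as plt
from sklearn.decomposition import PCA
from sklearn.svm import SVC
from mlxtend.plotting import plot_decision_regions

# Project the training features onto their first two principal components
pca = PCA(n_components=2)
X_train2 = pca.fit_transform(X_train)

# Refit the classifier on the 2D projection so it matches what is plotted
clf = SVC(C=100, gamma=0.0001)
clf.fit(X_train2, y_train)

# y_train must already be an integer NumPy array
plot_decision_regions(X_train2, y_train, clf=clf, legend=2)

# The axes are now principal components, not the original columns
plt.xlabel('Principal component 1', size=14)
plt.ylabel('Principal component 2', size=14)
plt.title('SVM Decision Region Boundary', size=16)
plt.show()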