AttributeError: 'SMOTE' object has no attribute '_validate_data'

HP_17 · Jun 17, 2020

I'm resampling my data (multiclass) by using SMOTE.

from imblearn.over_sampling import SMOTE

sm = SMOTE(random_state=1)
X_res, Y_res = sm.fit_resample(X_train, Y_train)

However, I'm getting this attribute error. Can anyone help?

Answer

VHS · Jul 30, 2020

Short answer

You need to upgrade scikit-learn to version 0.23.1.

Long answer

Version 0.7.0 of imbalanced-learn, the newest at the time of writing, seems to have an undocumented dependency on scikit-learn v0.23.1. Its samplers call the estimator method _validate_data, which scikit-learn introduced in 0.23, so you get AttributeError: 'SMOTE' object has no attribute '_validate_data' if your scikit-learn is 0.22 or below.
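To confirm that this is the cause, you can compare your installed scikit-learn version against the 0.23 threshold. The snippet below is only a rough sketch: parse_version and supports_imblearn_070 are hypothetical helper names made up for this illustration, not part of scikit-learn or imbalanced-learn, and the 0.23 minimum is taken from the explanation above.

```python
def parse_version(version):
    """Turn a version string such as '0.22.2.post1' into (0, 22, 2).

    Hypothetical helper: it keeps only the leading numeric components and
    stops at the first non-numeric one (for example 'post1' or 'dev0').
    """
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if digits == "":
            break
        parts.append(int(digits))
    return tuple(parts)


def supports_imblearn_070(sklearn_version, minimum="0.23"):
    """Return True if the given scikit-learn version is at least `minimum`."""
    return parse_version(sklearn_version) >= parse_version(minimum)


if __name__ == "__main__":
    try:
        import sklearn
    except ImportError:
        print("scikit-learn is not installed in this environment")
    else:
        ok = supports_imblearn_070(sklearn.__version__)
        print("scikit-learn", sklearn.__version__, "new enough:", ok)
```

If this reports that the version is not new enough, the AttributeError is expected and upgrading scikit-learn should resolve it.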

If you are using Anaconda, installing scikit-learn 0.23.1 might be tricky. conda update scikit-learn might not update scikit-learn to version 0.23 or higher, because the newest version available in Conda at the time of writing is 0.22.1. If you try conda install scikit-learn=0.23.1 or pip install scikit-learn==0.23.1 in an existing environment, you will trigger a large number of compatibility checks and the installation might be slow. The easiest way to get scikit-learn 0.23.1 in Anaconda is therefore to create a new virtual environment with a minimal set of packages, so that there are few or no conflicts. Then, in the new environment, install scikit-learn 0.23.1 followed by imbalanced-learn 0.7.0.

conda create -n test python=3.7.6
conda activate test
pip install scikit-learn==0.23.1
pip install imbalanced-learn==0.7.0
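Once both packages are installed, a quick smoke test in the new environment can show whether SMOTE now runs. This is a minimal sketch on made-up data: make_toy_data and resample_with_smote are hypothetical helpers written for this example, and the class sizes (30, 12 and 8) are arbitrary, chosen only so that every class has more samples than SMOTE's default of 5 neighbours.

```python
from collections import Counter


def make_toy_data():
    """Build a small, imbalanced three-class data set (made-up values)."""
    X, y = [], []
    for label, count in enumerate([30, 12, 8]):
        for i in range(count):
            # Two numeric features; each class sits in its own region.
            X.append([label * 10.0 + i * 0.1, label * 5.0 - i * 0.1])
            y.append(label)
    return X, y


def resample_with_smote(X, y, random_state=1):
    """Oversample the minority classes, as in the question's code."""
    # Imported here so that a missing package surfaces as an ImportError
    # only when resampling is actually attempted.
    from imblearn.over_sampling import SMOTE

    sm = SMOTE(random_state=random_state)
    return sm.fit_resample(X, y)


if __name__ == "__main__":
    X, y = make_toy_data()
    print("before:", Counter(y))
    try:
        X_res, y_res = resample_with_smote(X, y)
    except ImportError:
        print("imbalanced-learn is not installed in this environment")
    else:
        print("after: ", Counter(y_res))
```

If the environment is set up correctly, the script should finish without the AttributeError and the class counts after resampling should be balanced.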

Finally, you need to reinstall your IDE in the new virtual environment, or configure it to use that environment's Python interpreter, so that it can use these packages.

However, once scikit-learn version 0.23.1 becomes available in Conda and there are no compatibility issues, you can install it in the base environment directly.