I am trying to find the best parameters for a lightgbm model using GridSearchCV from sklearn.model_selection. I have not been able to find a solution that actually works.
I have managed to set up some partly working code:
import numpy as np
import pandas as pd
import lightgbm as lgb
from sklearn.model_selection import GridSearchCV
from sklearn.model_selection import KFold
np.random.seed(1)
train = pd.read_csv('train.csv')
test = pd.read_csv('test.csv')
y = pd.read_csv('y.csv')
y = y.values.ravel()
print(train.shape, test.shape, y.shape)
categoricals = ['COL_A','COL_B']
indexes_of_categories = [train.columns.get_loc(col) for col in categoricals]
gkf = KFold(n_splits=5, shuffle=True, random_state=42).split(X=train, y=y)
param_grid = {
    'num_leaves': [31, 127],
    'reg_alpha': [0.1, 0.5],
    'min_data_in_leaf': [30, 50, 100, 300, 400],
    'lambda_l1': [0, 1, 1.5],
    'lambda_l2': [0, 1]
}
lgb_estimator = lgb.LGBMClassifier(boosting_type='gbdt', objective='binary',
                                   num_boost_round=2000, learning_rate=0.01,
                                   metric='auc',
                                   categorical_feature=indexes_of_categories)
gsearch = GridSearchCV(estimator=lgb_estimator, param_grid=param_grid, cv=gkf)
lgb_model = gsearch.fit(X=train, y=y)
print(lgb_model.best_params_, lgb_model.best_score_)
This seems to be working, but with a UserWarning:
categorical_feature keyword has been found in params and will be ignored. Please use categorical_feature argument of the Dataset constructor to pass this parameter.
I am looking for a working solution, or perhaps a suggestion on how to ensure that lightgbm accepts the categorical arguments in the above code.
As the warning states, categorical_feature is not one of the LGBMModel constructor arguments. It is relevant when the lgb.Dataset is instantiated, which in the case of the sklearn API happens inside the fit() method (see the doc). Thus, in order to use it during the GridSearchCV optimisation, one has to provide it as an argument of the GridSearchCV.fit() method in the case of sklearn v0.19.1, or as an additional fit_params argument when instantiating GridSearchCV in older sklearn versions.