Understanding Python xgboost cv

kilojoules · Dec 26, 2015 · Viewed 31.8k times

I would like to use the xgboost cv function to find the best parameters for my training data set. I am confused by the API. How do I find the best parameter? Is this similar to the sklearn grid_search cross-validation function? How can I find which of the values for the max_depth parameter ([2, 4, 6]) was determined to be optimal?

from sklearn.datasets import load_iris
import xgboost as xgb
iris = load_iris()
DTrain = xgb.DMatrix(iris.data, iris.target)
x_parameters = {"max_depth":[2,4,6]}
xgb.cv(x_parameters, DTrain)
...
Out[6]: 
   test-rmse-mean  test-rmse-std  train-rmse-mean  train-rmse-std
0        0.888435       0.059403         0.888052        0.022942
1        0.854170       0.053118         0.851958        0.017982
2        0.837200       0.046986         0.833532        0.015613
3        0.829001       0.041960         0.824270        0.014501
4        0.825132       0.038176         0.819654        0.013975
5        0.823357       0.035454         0.817363        0.013722
6        0.822580       0.033540         0.816229        0.013598
7        0.822265       0.032209         0.815667        0.013538
8        0.822158       0.031287         0.815390        0.013508
9        0.822140       0.030647         0.815252        0.013494

Answer

Deepish · Apr 14, 2016

Sklearn's GridSearchCV is the way to go if you are looking for parameter tuning: xgb.cv only runs cross-validation for one fixed set of parameters, it does not search over a grid for you. Just pass the XGBoost classifier to GridSearchCV and pick the parameters with the best CV score.
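A minimal sketch of that approach (an assumption on my part, not code from the original answer), reusing the iris data and the max_depth grid from the question. It uses the modern sklearn.model_selection import; at the time of the question the class lived in sklearn.grid_search:

from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
import xgboost as xgb

iris = load_iris()

# Search over the same max_depth values as in the question
param_grid = {"max_depth": [2, 4, 6]}

# Wrap the sklearn-compatible XGBoost estimator in GridSearchCV (5-fold CV)
grid = GridSearchCV(xgb.XGBClassifier(), param_grid, cv=5)
grid.fit(iris.data, iris.target)

# Best parameter setting found and its mean cross-validated score
print(grid.best_params_)
print(grid.best_score_)

grid.best_params_ tells you directly which of the candidate max_depth values won, which is exactly what xgb.cv cannot report.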

Here is a nice tutorial that might help you get started with parameter tuning: http://www.analyticsvidhya.com/blog/2016/03/complete-guide-parameter-tuning-xgboost-with-codes-python/