What is the loss function for multi-class classification in XGBoost?

Anabel Gómez · Feb 1, 2017 · Viewed 8.7k times

I'm trying to find out which loss function XGBoost uses for multi-class classification. In this question I found the loss function for logistic classification in the binary case.

I had thought that for the multi-class case it might be the same as in GBM (for K classes), which can be seen here, where y_k = 1 if x's label is k and 0 otherwise, and p_k(x) is the softmax function. However, when I derived the first- and second-order derivatives of this loss function, the Hessian did not match the one defined in the code here (in the function GetGradient in SoftmaxMultiClassObj): it differs by a constant factor of 2.
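For reference, the comparison described above can be reproduced numerically. The sketch below assumes the loss in question is the multi-class log loss L = -sum_k y_k log p_k(x) with softmax probabilities (the linked formula is not reproduced here, so this is an assumption). It checks the closed-form gradient p_k - y_k and diagonal Hessian p_k (1 - p_k) against finite differences. The scores, label, and function names are hypothetical and are not taken from the XGBoost source.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of raw class scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def loss(scores, label):
    """Multi-class log loss -sum_k y_k * log p_k with a one-hot y."""
    return -math.log(softmax(scores)[label])

def analytic_grad_hess(scores, label):
    """Closed-form gradient and diagonal Hessian with respect to each raw score."""
    p = softmax(scores)
    grad = [p_k - (1.0 if k == label else 0.0) for k, p_k in enumerate(p)]
    hess = [p_k * (1.0 - p_k) for p_k in p]
    return grad, hess

def numeric_grad_hess(scores, label, eps=1e-4):
    """Central finite differences, used only to check the closed forms."""
    grad, hess = [], []
    base = loss(scores, label)
    for k in range(len(scores)):
        up = scores[:k] + [scores[k] + eps] + scores[k + 1:]
        down = scores[:k] + [scores[k] - eps] + scores[k + 1:]
        grad.append((loss(up, label) - loss(down, label)) / (2 * eps))
        hess.append((loss(up, label) - 2 * base + loss(down, label)) / eps ** 2)
    return grad, hess

scores = [0.5, -1.2, 2.0]  # hypothetical raw scores for K = 3 classes
label = 2                  # hypothetical true class index
g_a, h_a = analytic_grad_hess(scores, label)
g_n, h_n = numeric_grad_hess(scores, label)
```

Under these assumptions the analytic and numeric values agree, so the plain log loss gives a diagonal Hessian of p_k (1 - p_k), without the extra factor of 2.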

Could you please tell me which loss function is used?

Thank you in advance.

Answer

ashlaban · May 15, 2018

The loss function used for multiclass is, as you suspect, the softmax objective. As of now, the only options for multiclass are the two quoted below; the difference is that multi:softprob returns the probabilities for every class instead of only the most likely class.

“multi:softmax” –set XGBoost to do multiclass classification using the softmax objective, you also need to set num_class(number of classes)

“multi:softprob” –same as softmax, but output a vector of ndata * nclass, which can be further reshaped to ndata, nclass matrix. The result contains predicted probability of each data point belonging to each class.

See https://xgboost.readthedocs.io/en/latest//parameter.html#learning-task-parameters.
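To make the difference between the two outputs concrete, here is a minimal pure-Python sketch. It does not call XGBoost; it only mimics what the quoted descriptions say each objective returns. The function names `softprob` and `softmax_label` and the raw scores are hypothetical.

```python
import math

def softprob(raw_scores):
    """Per-class probabilities for each row of raw scores (one row per data point)."""
    out = []
    for row in raw_scores:
        m = max(row)
        exps = [math.exp(s - m) for s in row]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out

def softmax_label(raw_scores):
    """Index of the most probable class for each row."""
    return [max(range(len(p)), key=p.__getitem__) for p in softprob(raw_scores)]

# Hypothetical raw scores: 2 data points, 3 classes.
raw = [[2.0, 0.1, -1.0], [0.0, 0.3, 1.5]]
probs = softprob(raw)        # ndata x nclass matrix, as described for multi:softprob
labels = softmax_label(raw)  # one class index per data point, as described for multi:softmax
```

In XGBoost itself, the choice is made through the `objective` parameter together with `num_class`, as the quoted documentation describes.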