In xgboost it is possible to set the parameter `weight` for a `DMatrix`. This is apparently a list of weights wherein each value is a weight for the corresponding sample.

I can't find any information on how these weights are actually used in the gradient boosting procedure. Are they related to `eta`?

For example, if I set `weight` to 0.3 for all samples and `eta` to 1, would this be the same as setting `eta` to 0.3 and `weight` to 1?
`xgboost` allows for instance weighting during the construction of the `DMatrix`, as you noted. The weight is directly tied to the instance and travels with it throughout training. It is therefore included in the calculations of the gradients and hessians, and directly impacts the split points and the training of an `xgboost` model.
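As a sketch of how that works, following the usual second-order derivation of the XGBoost objective: an instance weight $w_i$ scales that instance's gradient and hessian, which in turn shifts the optimal leaf values $v_j$ (and hence the split gains):

$$g_i = w_i \, \partial_{\hat{y}_i} l(y_i, \hat{y}_i), \qquad h_i = w_i \, \partial^2_{\hat{y}_i} l(y_i, \hat{y}_i), \qquad v_j^* = -\frac{\sum_{i \in I_j} g_i}{\sum_{i \in I_j} h_i + \lambda}$$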
Instance Weight File

XGBoost supports providing each instance a weight to differentiate the importance of instances. For example, if we provide an instance weight file for the "train.txt" file in the example as below:

train.txt.weight

```
1
0.5
0.5
1
0.5
```

It means that XGBoost will put more emphasis on the first and fourth instances (that is to say, the positive instances) while training. The configuration is similar to configuring the group information. If the instance file name is "xxx", XGBoost will check whether there is a file named "xxx.weight" in the same directory and, if there is, will use the weights while training models.
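The quoted passage describes the CLI's text-file interface; with the Python package the same per-instance weights can be passed directly when building the `DMatrix`. A minimal sketch (the toy data here is made up purely for illustration):

```python
import numpy as np
import xgboost as xgb

# Toy data standing in for "train.txt": 5 rows, binary labels.
X = np.array([[1.0], [2.0], [3.0], [4.0], [5.0]])
y = np.array([1, 0, 0, 1, 0])

# Same weights as the train.txt.weight example above.
w = np.array([1.0, 0.5, 0.5, 1.0, 0.5])

dtrain = xgb.DMatrix(X, label=y, weight=w)
booster = xgb.train({"objective": "binary:logistic"}, dtrain, num_boost_round=10)
```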
It is very different from `eta`. `eta` simply tells `xgboost` how much to blend the last tree trained into the ensemble; it is a measure of how greedy the ensemble should be at each iteration.
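In the usual shrinkage formulation, after round $t$ the prediction is updated as

$$\hat{y}_i^{(t)} = \hat{y}_i^{(t-1)} + \eta \, f_t(x_i),$$

so `eta` uniformly scales down each new tree's contribution, rather than reweighting individual instances.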
For example, if I set `weight` to 0.3 for all samples and `eta` to 1, would this be the same as setting `eta` to 0.3 and `weight` to 1?
A constant `weight` of 1 for all instances is the default, so changing that to a constant of 0.3 for all instances is still equal weighting and shouldn't impact things too much. However, setting `eta` up to 1 from 0.3 would make the training much more aggressive.
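A quick way to convince yourself is to train both configurations and compare; a sketch with made-up toy data (the parameter values are just for illustration):

```python
import numpy as np
import xgboost as xgb

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 5))
y = (X[:, 0] + rng.normal(scale=0.5, size=500) > 0).astype(int)

params = {"objective": "binary:logistic", "max_depth": 3}

# Configuration A: weight = 0.3 for every row, eta = 1.
d_w = xgb.DMatrix(X, label=y, weight=np.full(500, 0.3))
m_w = xgb.train({**params, "eta": 1.0}, d_w, num_boost_round=20)

# Configuration B: weight = 1 (the default), eta = 0.3.
d_e = xgb.DMatrix(X, label=y)
m_e = xgb.train({**params, "eta": 0.3}, d_e, num_boost_round=20)

# The two models differ: uniform weights rescale all gradients/hessians
# (which only interacts with regularization terms such as lambda), while
# eta shrinks every tree's contribution on each boosting round.
print(np.abs(m_w.predict(d_w) - m_e.predict(d_e)).max())
```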