Premise

I'm a complete beginner, so I'm leaving this as a memo. If I've made a mistake, please point it out gently, because I'm easily discouraged. This memo is meant to make the code from the reference site easier to understand. The environment is Azure ML, and I'm running Optuna to search for hyperparameters.

Prerequisite knowledge

- num_boost_round is the number of gradient boosting iterations (rounds).
- early_stopping ends training when the validation metric has not improved for the specified number of consecutive rounds.
- A callback is a hook built into XGBoost that runs during training and is handy for debugging (my understanding here is vague).
- Reference: https://xgboost.readthedocs.io/en/latest/python/python_api.html#callback-api
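To make the early stopping rule above concrete, here is a minimal pure-Python sketch of the idea. It is not XGBoost's implementation: the function name, the `patience` parameter, and the MAE values are all made up for illustration, and it assumes a metric where lower is better.

```python
def early_stopping_round(scores, patience):
    """Return (stop_iteration, best_iteration) for a metric to be minimized.

    scores: validation metric per boosting round (lower is better).
    patience: number of rounds without improvement to tolerate.
    """
    best_score = float("inf")
    best_iteration = -1
    for iteration, score in enumerate(scores):
        if score < best_score:
            best_score = score
            best_iteration = iteration
        elif iteration - best_iteration >= patience:
            # No improvement for `patience` rounds in a row: stop here.
            return iteration, best_iteration
    # Ran through every round without triggering early stopping.
    return len(scores) - 1, best_iteration


# Hypothetical MAE values, made up for illustration only.
mae_per_round = [10.0, 9.0, 8.5, 8.6, 8.7, 8.8, 8.4]
stop_at, best_at = early_stopping_round(mae_per_round, patience=3)
print(stop_at, best_at)
```

In this toy run the best score is at round 2 and training stops at round 5, so the final 8.4 is never reached. That is the trade-off a small patience value makes.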

Implementation

Minimal implementation

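def return_callback():
    # Build the callback that XGBoost calls after every boosting iteration.
    def print_num_boost_round(env):
        iteration = env.iteration
        # evaluation_result_list holds (metric name, value) tuples.
        msg = '\t'.join([str(x) for x in env.evaluation_result_list])
        print(iteration, msg)
    return print_num_boost_round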

Running this produces output like the following:

0  ('validation_0-mae', 2657.650391)
1  ('validation_0-mae', 2657.609375)
0  ('validation_0-mae', 2624.649658)
2  ('validation_0-mae', 2657.425049)
1  ('validation_0-mae', 2624.609131)

Next, change the code to print the whole env object:

def return_callback():
    def print_num_boost_round(env):
        # Print every field of env to see what the callback receives.
        print(env)
    return print_num_boost_round

XGBoostCallbackEnv(model=<xgboost.core.Booster object at 0x7fa972703208>, cvfolds=None, iteration=0, begin_iteration=0, end_iteration=100, rank=0, evaluation_result_list=[('validation_0-mae', 2657.623047)])
XGBoostCallbackEnv(model=<xgboost.core.Booster object at 0x7fa972703208>, cvfolds=None, iteration=1, begin_iteration=0, end_iteration=100, rank=0, evaluation_result_list=[('validation_0-mae', 2657.463379)])
XGBoostCallbackEnv(model=<xgboost.core.Booster object at 0x7f7a8224c208>, cvfolds=None, iteration=0, begin_iteration=0, end_iteration=100, rank=0, evaluation_result_list=[('validation_0-mae', 2624.622314)])
XGBoostCallbackEnv(model=<xgboost.core.Booster object at 0x7fa972703208>, cvfolds=None, iteration=2, begin_iteration=0, end_iteration=100, rank=0, evaluation_result_list=[('validation_0-mae', 2657.411377)])
XGBoostCallbackEnv(model=<xgboost.core.Booster object at 0x7f7a8224c208>, cvfolds=None, iteration=1, begin_iteration=0, end_iteration=100, rank=0, evaluation_result_list=[('validation_0-mae', 2624.467285)])
XGBoostCallbackEnv(model=<xgboost.core.Booster object at 0x7fa972703208>, cvfolds=None, iteration=3, begin_iteration=0, end_iteration=100, rank=0, evaluation_result_list=[('validation_0-mae', 2657.355957)])
XGBoostCallbackEnv(model=<xgboost.core.Booster object at 0x7f0ced02c208>, cvfolds=None, iteration=0, begin_iteration=0, end_iteration=100, rank=0, evaluation_result_list=[('validation_0-mae', 2639.834229)])
XGBoostCallbackEnv(model=<xgboost.core.Booster object at 0x7f7a8224c208>, cvfolds=None, iteration=2, begin_iteration=0, end_iteration=100, rank=0, evaluation_result_list=[('validation_0-mae', 2624.416016)])

This shows that the current iteration number is available as env.iteration.

Reference (https://kunsen.net/2020/05/02/post-3199/)
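Since env.iteration is available and each trial appears to train its own Booster (the object addresses differ in the output above), one way to see how many rounds each training actually ran is to record the last iteration per model. The sketch below is only an illustration: `CallbackEnv` is a stand-in namedtuple whose field names are copied from the printed output, and `make_round_recorder` is a hypothetical helper. It targets the function-style callback used in this memo; newer XGBoost releases use a class-based interface (xgboost.callback.TrainingCallback) instead.

```python
from collections import namedtuple

# Stand-in for the env object printed above; the field names are copied from
# that output. In a real run XGBoost builds this object, not user code.
CallbackEnv = namedtuple(
    "CallbackEnv",
    ["model", "cvfolds", "iteration", "begin_iteration",
     "end_iteration", "rank", "evaluation_result_list"],
)


def make_round_recorder():
    """Return (callback, last_round); last_round maps id(model) to the last iteration seen."""
    last_round = {}

    def record_round(env):
        # Each trial trains its own Booster, so key the record by model identity.
        last_round[id(env.model)] = env.iteration

    return record_round, last_round


# Simulate two interleaved trainings, similar to the output above.
callback, last_round = make_round_recorder()
model_a, model_b = object(), object()  # hypothetical placeholders for Boosters
for model, iteration in [(model_a, 0), (model_a, 1), (model_b, 0), (model_a, 2)]:
    callback(CallbackEnv(model, None, iteration, 0, 100, 0,
                         [("validation_0-mae", 0.0)]))

# Iterations are zero-based, so the number of rounds is last iteration + 1.
print(last_round[id(model_a)] + 1, last_round[id(model_b)] + 1)
```

Keying by id(env.model) keeps interleaved output from parallel trials apart, which a plain print cannot do.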

Let Optuna tune num_boost_round and see what it picks

param_list['num_boost_round'] = trial.suggest_int("num_boost_round", 100, 500)

First, let Optuna search num_boost_round over the range 100 to 500.

Parameters specified by hand (fixed)

"objective": "reg:gamma",
"eval_metric": "mae",
"verbosity": 0,
"booster": "gbtree",
"subsample": 1,
"subsample_freq": 0,
"early_stopping": 5,
"colsample_bylevel": 1,

Parameters tuned by Optuna

"min_child_weight": "",
"eta": "",
"lambda": "",
"alpha": "",
"num_leaves": "",
"colsample_bytree": "",
"num_boost_round": "",

Running the search as is gives the following best parameters:


{
 'max_depth': 20,
 'eta': 0.22613771945050443,
 'num_leaves': 2560,
 'lambda': 6.0425529841148486e-05,
 'alpha': 6.69043393720362e-07,
 'num_boost_round': 236,
 'colsample_bytree': 0.9727432424922707,
 'min_child_weight': 239.6173703091301
}

num_boost_round came out as 236. It is not the same every time because Optuna samples it; when I ran it again it was 253. So what does 236 mean? Is training really running for 236 rounds in the first place? The callback output is:

0 ('validation_0-mae', 2657.650391)
1  ('validation_0-mae', 2657.609375)
0  ('validation_0-mae', 2624.649658)
2  ('validation_0-mae', 2657.425049)
1  ('validation_0-mae', 2624.609131)

This is what gets printed, but the iteration count only goes up to 100, as end_iteration indicates. Next, I searched the output by hand for the minimum value. The minimum was 135.56956, so I counted the number of lines in which that value appeared. The result was 482.
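The manual search described here (find the minimum value, then count the lines that contain it) can be automated. The following is a rough sketch that assumes the log lines have exactly the format printed by the callback in this memo; `summarize_log` and `LINE_PATTERN` are hypothetical names, and the sample lines are the ones shown earlier, not the full log.

```python
import re
from collections import Counter

# Matches lines such as: 0  ('validation_0-mae', 2657.650391)
LINE_PATTERN = re.compile(r"^\s*(\d+)\s+\('([^']+)',\s*([0-9.eE+-]+)\)\s*$")


def summarize_log(lines):
    """Return (max_iteration, min_value, count_of_min) for the parsed lines."""
    iterations, values = [], []
    for line in lines:
        match = LINE_PATTERN.match(line)
        if match is None:
            continue  # skip anything that is not a callback line
        iterations.append(int(match.group(1)))
        values.append(float(match.group(3)))
    if not values:
        raise ValueError("no callback lines found")
    min_value = min(values)
    return max(iterations), min_value, Counter(values)[min_value]


# Sample lines copied from the output shown earlier in this memo.
sample = [
    "0  ('validation_0-mae', 2657.650391)",
    "1  ('validation_0-mae', 2657.609375)",
    "0  ('validation_0-mae', 2624.649658)",
    "2  ('validation_0-mae', 2657.425049)",
    "1  ('validation_0-mae', 2624.609131)",
]
print(summarize_log(sample))
```

Applied to the full log, the first element would confirm whether the iteration ever exceeds 99, and the last element would reproduce the hand count.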

Conclusion

Looking closely, the same iteration number does not imply the same value, presumably because lines with the same iteration number come from different Booster objects (the addresses in the env output differ). It might have been easier to understand if I had read the XGBoost paper first and had that as background knowledge. Do I have no choice but to just push on for now?
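One possible explanation for the iteration count stopping at 100 is offered here only as an assumption, because the training call itself is not shown in this memo. In XGBoost's native API, num_boost_round and early_stopping_rounds are keyword arguments of xgboost.train rather than keys of the params dict. In the scikit-learn wrapper, the number of rounds is set by n_estimators, whose default of 100 would match end_iteration=100. The helper below is a hypothetical sketch that separates those two keys from the rest. `split_train_kwargs` and `TRAIN_KWARG_NAMES` are illustrative names, not part of XGBoost.

```python
# Keys that xgboost.train takes as keyword arguments instead of reading them
# from the params dict. "early_stopping" is this memo's own key name; the
# keyword argument itself is called early_stopping_rounds.
TRAIN_KWARG_NAMES = {
    "num_boost_round": "num_boost_round",
    "early_stopping": "early_stopping_rounds",
}


def split_train_kwargs(all_params):
    """Split a flat dict into (booster_params, train_kwargs) without mutating it."""
    booster_params, train_kwargs = {}, {}
    for key, value in all_params.items():
        if key in TRAIN_KWARG_NAMES:
            train_kwargs[TRAIN_KWARG_NAMES[key]] = value
        else:
            booster_params[key] = value
    return booster_params, train_kwargs


# Values taken from the parameter lists above (num_boost_round from one trial).
flat = {"objective": "reg:gamma", "eval_metric": "mae",
        "early_stopping": 5, "eta": 0.22613771945050443, "num_boost_round": 236}
booster_params, train_kwargs = split_train_kwargs(flat)
print(train_kwargs)

# Intended use (not executed here; dtrain and dvalid are hypothetical DMatrix objects):
# booster = xgboost.train(booster_params, dtrain,
#                         evals=[(dvalid, "validation")], **train_kwargs)
# booster.best_iteration  # best round found when early stopping is active
```

If this assumption holds, the best_iteration attribute of the returned Booster would give the round count this memo set out to display, without needing a callback at all.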

[PYTHON] I want to display the actual number of boosting rounds (num_boost_round) when early_stopping is applied, using an XGBoost callback (not achieved)