How to set up Random forest using Optuna

Correction (200704): On reflection, `max_depth`, `n_estimators`, and similar parameters should have been sampled as discrete values with `suggest_discrete_uniform(name, low, high, q)`, so I modified the script. Because `suggest_discrete_uniform` returns a float, `max_depth`, `n_estimators`, and `max_leaf_nodes` are wrapped as `int(suggest_discrete_uniform())` to convert them to integers before they are passed to the model.

Before the correction: I covered how to use `Optuna` last time, so from now on I will describe the settings for individual models. A random forest accepts many arguments, and I set all the main ones through `Optuna`. I wondered whether to pass `max_depth` and `n_estimators` as integers or as categories, and this time I passed them as integers.
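To make the float-to-integer step concrete, here is a minimal plain-Python sketch of a discrete uniform draw over `low`..`high` with step `q`. It does not call Optuna: `discrete_uniform` is a hypothetical stand-in written only for illustration, and the seed is arbitrary.

```python
import random


def discrete_uniform(low, high, q, rng):
    """Hypothetical stand-in: draw a float from {low, low+q, ..., high}."""
    n_steps = int((high - low) // q)
    return float(low + q * rng.randint(0, n_steps))


rng = random.Random(0)  # arbitrary seed, for repeatability only
raw = discrete_uniform(10, 1000, 10, rng)  # a float on the 10-step grid
max_depth = int(raw)                       # integer form for the model

print(type(raw).__name__, type(max_depth).__name__)  # float int
```

The draw always lies on the grid, so `int()` drops nothing; it only changes the type to the integer that scikit-learn expects for parameters such as `n_estimators`.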

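import optuna
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score

# X_train and y_train are assumed to be defined beforehand
def objective(trial):
    # 'mse', 'mae' and 'auto' are the names used by older scikit-learn releases;
    # newer releases use 'squared_error' / 'absolute_error' and no longer accept 'auto'
    criterion = trial.suggest_categorical('criterion', ['mse', 'mae'])
    # Use real booleans: the strings 'True' and 'False' are both truthy
    bootstrap = trial.suggest_categorical('bootstrap', [True, False])
    # suggest_discrete_uniform returns a float, so cast each value to int
    max_depth = int(trial.suggest_discrete_uniform('max_depth', 10, 1000, 10))
    max_features = trial.suggest_categorical('max_features', ['auto', 'sqrt', 'log2'])
    max_leaf_nodes = int(trial.suggest_discrete_uniform('max_leaf_nodes', 10, 1000, 10))
    n_estimators = int(trial.suggest_discrete_uniform('n_estimators', 10, 1000, 10))
    regr = RandomForestRegressor(bootstrap = bootstrap, criterion = criterion,
                                 max_depth = max_depth, max_features = max_features,
                                 max_leaf_nodes = max_leaf_nodes, n_estimators = n_estimators, n_jobs=2)
    # The mean R^2 over 3-fold cross-validation is the value Optuna maximizes
    score = cross_val_score(regr, X_train, y_train, cv=3, scoring="r2")
    r2_mean = score.mean()
    return r2_mean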
# Run the search with Optuna
study = optuna.create_study(direction='maximize')
study.optimize(objective, n_trials=1000)
# Fit a model with the tuned hyperparameters
optimised_rf = RandomForestRegressor(bootstrap = study.best_params['bootstrap'], criterion = study.best_params['criterion'],
                                     max_depth = int(study.best_params['max_depth']), max_features = study.best_params['max_features'],
                                     max_leaf_nodes = int(study.best_params['max_leaf_nodes']), n_estimators = int(study.best_params['n_estimators']))
optimised_rf.fit(X_train, y_train)

I used this to tune the hyperparameters on the Boston dataset, and the resulting model fit well.
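The final fit above repeats every `study.best_params[...]` lookup by hand. One possible shortcut is a small helper that casts the integer-valued entries once and returns a dictionary of keyword arguments. This is only a sketch: `to_rf_kwargs` is a hypothetical name, and the parameter values below are invented stand-ins for a real `study.best_params`.

```python
def to_rf_kwargs(best_params, int_keys=("max_depth", "max_leaf_nodes", "n_estimators")):
    """Return a copy of best_params with the listed keys cast to int."""
    kwargs = dict(best_params)  # copy so the original dict is left untouched
    for key in int_keys:
        if key in kwargs:
            kwargs[key] = int(kwargs[key])
    return kwargs


# Invented values standing in for study.best_params (not real results)
example_best_params = {
    "criterion": "mse",
    "bootstrap": True,
    "max_depth": 120.0,
    "max_features": "sqrt",
    "max_leaf_nodes": 340.0,
    "n_estimators": 500.0,
}

rf_kwargs = to_rf_kwargs(example_best_params)
print(rf_kwargs["max_depth"], rf_kwargs["n_estimators"])  # 120 500
```

Because the keys match the constructor's argument names, the result could then be passed as `RandomForestRegressor(**to_rf_kwargs(study.best_params))`.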


