[PYTHON] Take a peek at the processing of LightGBM Tuner

Introduction

The LightGBMTuner introduced in [a presentation at PyData.Tokyo at the end of September last year](https://pydatatokyo.connpass.com/event/141272/presentation/) has finally been implemented.

- [Automatic optimization of hyperparameters by LightGBM Tuner, an extension of Optuna | Preferred Networks Research & Development](https://tech.preferred.jp/ja/blog/hyperparameter-tuning-with-optuna-integration-lightgbm-tuner/)

In LightGBM, as in other models, the parameters are not independent; they interact with each other. Higher accuracy can therefore be expected when the parameters are tuned step by step, one group at a time, rather than all at once. As far as I can tell, the concept is to tune them in order, starting with the parameters that have the greatest influence.
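
To make the step-by-step idea concrete, here is a minimal sketch in plain Python. It is not Optuna's code: `stepwise_tune`, `toy_loss`, and the parameter names `x` and `y` are all hypothetical, and the toy loss only stands in for a validation score. Each stage searches one parameter while the others stay fixed at the best values found so far.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Params = Dict[str, float]


def stepwise_tune(
    objective: Callable[[Params], float],
    base_params: Params,
    stages: List[Tuple[str, Sequence[float]]],
) -> Params:
    """Tune one parameter per stage, keeping earlier winners fixed."""
    params = dict(base_params)
    for name, candidates in stages:
        best_value, best_score = params[name], float("inf")
        for candidate in candidates:
            # Only the parameter of the current stage changes.
            score = objective({**params, name: candidate})
            if score < best_score:
                best_value, best_score = candidate, score
        params[name] = best_value
    return params


def toy_loss(p: Params) -> float:
    # Hypothetical loss with an interaction term between x and y.
    return (p["x"] - 0.7) ** 2 + (p["y"] - 3) ** 2 + 0.2 * p["x"] * p["y"]


base = {"x": 1.0, "y": 1}
tuned = stepwise_tune(
    toy_loss,
    base,
    stages=[
        ("x", [0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]),  # most influential first
        ("y", [1, 2, 3, 4, 5]),
    ],
)
print(tuned, toy_loss(tuned))
```

Because of the interaction term, the best `x` found in the first stage depends on the value `y` had at that time, which is one reason the order of the stages matters.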

Confirmation environment

Processing content

The implementation is here

| No. | Contents | Method name | Parameter name | Tuning range | Number of trials |
|---|---|---|---|---|---|
| 1 | feature_fraction (first time) | tune_feature_fraction() | feature_fraction | 0.4~1.0 | 7 |
| 2 | num_leaves | tune_num_leaves() | num_leaves | 0~1 (uses optuna.samplers.TPESampler) | 20 |
| 3 | bagging | tune_bagging() | bagging_fraction, bagging_freq | 0~1 (uses optuna.samplers.TPESampler) | 10 |
| 4 | feature_fraction (second time) | tune_feature_fraction_stage2() | feature_fraction | First optimum value ± 0.08 (excluding values outside the 0.4~1.0 range) | 3~6 |
| 5 | regularization | tune_regularization_factors() | lambda_l1, lambda_l2 | 0~1 (uses optuna.samplers.TPESampler) | 20 |
| 6 | min_data_in_leaf | tune_min_data_in_leaf() | min_child_samples | 5, 10, 25, 50, 100 | 5 |
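
The two feature_fraction passes can be sketched as candidate grids. This is an illustration in plain Python, not the tuner's code: the helper names are hypothetical, and the use of 6 evenly spaced points in the second pass is an assumption chosen because it reproduces the "3~6 trials" figure once out-of-range values are dropped.

```python
from typing import List

FF_MIN, FF_MAX = 0.4, 1.0  # allowed feature_fraction range


def linspace(low: float, high: float, n: int) -> List[float]:
    """Return n evenly spaced values from low to high, rounded to tame float noise."""
    return [round(low + (high - low) * i / (n - 1), 6) for i in range(n)]


def stage1_candidates() -> List[float]:
    # First pass: 7 trials across the whole 0.4~1.0 range.
    return linspace(FF_MIN, FF_MAX, 7)


def stage2_candidates(best: float, width: float = 0.08, n_points: int = 6) -> List[float]:
    # Second pass: points within best ± width (n_points is an assumption),
    # excluding anything outside the 0.4~1.0 range.
    return [
        v
        for v in linspace(best - width, best + width, n_points)
        if FF_MIN <= v <= FF_MAX
    ]


print(stage1_candidates())
for best in (0.4, 0.7, 1.0):
    print(best, stage2_candidates(best))
```

When the first optimum sits at the edge of the range (0.4 or 1.0), half of the second-pass points fall outside it and are discarded, which is why the number of trials varies.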

The points I was interested in are as follows.

- What is the effect of tuning feature_fraction in two stages?
- Is feature_fraction really tuned before num_leaves?
- The tuning range of lambda_l1 and lambda_l2 is narrowed to 0 to 1 (I used to set the maximum to 100, but was that too wide?).

If you want to change the tuning procedure to your liking, you currently need to monkey-patch run() with your own implementation.
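
The following is a rough sketch of what such a monkey patch looks like. `StandInTuner` is a hypothetical stand-in written for this example, not Optuna's LightGBMTuner; only the tune_* method names are modeled on the tuner's steps. The real run() may also do setup work that a patch would need to keep, so check the source of your installed version before copying this pattern.

```python
from typing import List


class StandInTuner:
    """Hypothetical stand-in for the tuner; each step only records its name."""

    def __init__(self) -> None:
        self.log: List[str] = []

    def tune_feature_fraction(self) -> None:
        self.log.append("feature_fraction")

    def tune_num_leaves(self) -> None:
        self.log.append("num_leaves")

    def tune_bagging(self) -> None:
        self.log.append("bagging")

    def tune_feature_fraction_stage2(self) -> None:
        self.log.append("feature_fraction_stage2")

    def tune_regularization_factors(self) -> None:
        self.log.append("regularization")

    def tune_min_data_in_leaf(self) -> None:
        self.log.append("min_data_in_leaf")

    def run(self) -> None:
        # Default order of the six steps.
        self.tune_feature_fraction()
        self.tune_num_leaves()
        self.tune_bagging()
        self.tune_feature_fraction_stage2()
        self.tune_regularization_factors()
        self.tune_min_data_in_leaf()


def custom_run(self: StandInTuner) -> None:
    # Example customization: tune num_leaves first and skip the
    # second feature_fraction pass.
    self.tune_num_leaves()
    self.tune_feature_fraction()
    self.tune_bagging()
    self.tune_regularization_factors()
    self.tune_min_data_in_leaf()


# The monkey patch: replace run() on the class itself.
StandInTuner.run = custom_run

tuner = StandInTuner()
tuner.run()
print(tuner.log)
```

Patching the class affects every instance created afterwards, so this is best kept in one clearly marked place in your own code.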

Supplement

If the meaning of the parameters is unclear, start by building an intuition for them with the article below. [Getting a feel for the important parameters in gradient boosting - nykergoto's blog](https://nykergoto.hatenablog.jp/entry/2019/03/29/%E5%8B%BE%E9%85%8D%E3%83%96%E3%83%BC%E3%82%B9%E3%83%86%E3%82%A3%E3%83%B3%E3%82%B0%E3%81%A7%E5%A4%A7%E4%BA%8B%E3%81%AA%E3%83%91%E3%83%A9%E3%83%A1%E3%83%BC%E3%82%BF%E3%81%AE%E6%B0%97%E6%8C%81%E3%81%A1)
