Reproduce LightGBM Objective with Python

One of the features of LightGBM, a library popular on Kaggle, is that it optimizes a model using the gradient of the objective function you want to minimize. Choosing this objective function well is one of the tricks for building a good model, and many objectives are implemented by default. In addition, LightGBM lets you pass a Python function that computes the gradient yourself. In this article, I would like to show what kind of implementation this involves by recreating equivalent objectives in Python while referring to LightGBM's official implementation.

Prerequisites

In this article, it is assumed that regression / binary classification is performed by passing an lgb.Dataset to lgb.train. When writing an objective for multi-class classification, I think [Mr. Tawara's article (forcibly performing Multi-Task (huh???) Regression with LightGBM)](https://tawara.hatenablog.com/entry/2020/05/14/120016) is easy to follow. Also, I will not explain here why grad and hess are needed, so please refer to other materials for that.
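
For concreteness, the setup assumed throughout this article looks roughly like the following (a minimal sketch with synthetic data of my own; the actual data and parameters are up to you). The custom objectives and metrics below plug into this same lgb.train call: depending on your LightGBM version, a custom objective is passed either as the fobj argument or as a callable in params["objective"], and a custom metric as feval.

import numpy as np
import lightgbm as lgb

# Synthetic regression data, just for illustration
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))
y = X[:, 0] + rng.normal(size=1000)

train_set = lgb.Dataset(X, label=y)
booster = lgb.train({"objective": "l2", "verbose": -1}, train_set, num_boost_round=50)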

Imitate the official behavior

When objective = "l2"

Now for the main subject. Since the core of LightGBM is implemented in C++ for speed, the objective part is also written in C++. Let's read the code (https://github.com/microsoft/LightGBM/blob/master/src/objective/regression_objective.hpp) for the behavior when objective = "l2". The part that calculates the gradient is implemented in GetGradients().

cpp:github.com/microsoft/LightGBM/blob/master/src/objective/regression_objective.hpp


  void GetGradients(const double* score, score_t* gradients,
                    score_t* hessians) const override {
    if (weights_ == nullptr) {
      #pragma omp parallel for schedule(static)
      for (data_size_t i = 0; i < num_data_; ++i) {
        gradients[i] = static_cast<score_t>(score[i] - label_[i]);
        hessians[i] = 1.0f;
      }
    } else {
      #pragma omp parallel for schedule(static)
      for (data_size_t i = 0; i < num_data_; ++i) {
        gradients[i] = static_cast<score_t>((score[i] - label_[i]) * weights_[i]);
        hessians[i] = static_cast<score_t>(weights_[i]);
      }
    }
  }

It is too slow to be practical, but reproducing this in Python (ignoring sample weights) looks like this.

def l2_loss(pred, data):
    # data is the lgb.Dataset passed to lgb.train; pred is the raw score per row
    true = data.get_label()
    grad = pred - true          # first derivative of 0.5 * (pred - true)^2
    hess = np.ones(len(grad))   # second derivative is the constant 1
    return grad, hess
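
For example, using the synthetic data from the sketch above, the custom objective can be passed to lgb.train like this (here via the fobj argument; on recent LightGBM versions the callable goes into params["objective"] instead). It should train essentially the same model as objective = "l2", up to details such as the initial score.

booster_custom = lgb.train(
    {"verbose": -1},
    lgb.Dataset(X, label=y),
    num_boost_round=50,
    fobj=l2_loss,   # custom objective returning (grad, hess)
)
pred_custom = booster_custom.predict(X)   # raw scores; for l2 these are the predictions themselves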

When objective = "poisson"

This objective minimizes the Poisson loss. Reading the metric code for the Poisson loss, it is as follows.

cpp:github.com/microsoft/LightGBM/blob/master/src/metric/regression_metric.hpp


class PoissonMetric: public RegressionMetric<PoissonMetric> {
 public:
  explicit PoissonMetric(const Config& config) :RegressionMetric<PoissonMetric>(config) {
  }

  inline static double LossOnPoint(label_t label, double score, const Config&) {
    const double eps = 1e-10f;
    if (score < eps) {
      score = eps;
    }
    return score - label * std::log(score);
  }
  inline static const char* Name() {
    return "poisson";
  }
};

And reading the objective, it has the following implementation.

cpp:github.com/microsoft/LightGBM/blob/master/src/objective/regression_objective.hpp


  void GetGradients(const double* score, score_t* gradients,
                    score_t* hessians) const override {
    if (weights_ == nullptr) {
      #pragma omp parallel for schedule(static)
      for (data_size_t i = 0; i < num_data_; ++i) {
        gradients[i] = static_cast<score_t>(std::exp(score[i]) - label_[i]);
        hessians[i] = static_cast<score_t>(std::exp(score[i] + max_delta_step_));
      }
    } else {
      #pragma omp parallel for schedule(static)
      for (data_size_t i = 0; i < num_data_; ++i) {
        gradients[i] = static_cast<score_t>((std::exp(score[i]) - label_[i]) * weights_[i]);
        hessians[i] = static_cast<score_t>(std::exp(score[i] + max_delta_step_) * weights_[i]);
      }
    }
  }

... did you notice? Actually, the score in this objective is not the predicted value itself: the predicted value is expressed as predicted value = e^score, i.e. the score is the exponent x in e^x. You can confirm this by entering the formula into WolframAlpha. Therefore, when you write a Poisson objective yourself (the same applies to gamma and tweedie), you have to compute the metric with predicted value = e^pred.
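
As a sanity check (my own derivation, not something stated in the LightGBM source): with predicted value = e^x for raw score x and label y, the per-sample Poisson loss (dropping constant terms) is

L(x) = e^x - y * x

and its first and second derivatives with respect to x are

dL/dx = e^x - y,    d^2L/dx^2 = e^x

which is exactly the gradient above; the hessian in the C++ code is additionally multiplied by e^max_delta_step for numerical stability.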

def poisson_metric(pred, data):
    # pred is the raw score x, so the predicted value is np.exp(pred)
    true = data.get_label()
    loss = np.exp(pred) - true * pred
    return "poisson", np.mean(loss), False

def poisson_object(pred, data):
    poisson_max_delta_step = 0.7  # LightGBM's default for poisson_max_delta_step
    true = data.get_label()
    grad = np.exp(pred) - true
    hess = np.exp(pred + poisson_max_delta_step)
    return grad, hess
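
A usage sketch under the same assumptions as before (my own example; the labels are made non-negative because Poisson regression requires it): since the model is trained on the raw score, predictions also come back as the exponent and have to be converted with np.exp.

train_poisson = lgb.Dataset(X, label=np.abs(y))   # Poisson regression needs non-negative labels
booster_poisson = lgb.train(
    {"verbose": -1},
    train_poisson,
    num_boost_round=50,
    valid_sets=[train_poisson],   # feval is evaluated on valid_sets
    fobj=poisson_object,
    feval=poisson_metric,
)
raw_score = booster_poisson.predict(X)   # this is x in predicted value = e^x
pred = np.exp(raw_score)                 # convert back to the actual predicted value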

When objective = "binary"

Getting a bit carried away, I would also like to look at the objective for binary classification. The metric for binary is as follows.

cpp:github.com/microsoft/LightGBM/blob/master/src/metric/binary_metric.hpp


class BinaryLoglossMetric: public BinaryMetric<BinaryLoglossMetric> {
 public:
  explicit BinaryLoglossMetric(const Config& config) :BinaryMetric<BinaryLoglossMetric>(config) {}

  inline static double LossOnPoint(label_t label, double prob) {
    if (label <= 0) {
      if (1.0f - prob > kEpsilon) {
        return -std::log(1.0f - prob);
      }
    } else {
      if (prob > kEpsilon) {
        return -std::log(prob);
      }
    }
    return -std::log(kEpsilon);
  }

  inline static const char* Name() {
    return "binary_logloss";
  }
};

Note that in the objective, sigmoid_ = 1, label_val_ = [-1, 1], and label_weights_ = [1, 1] when is_unbalance = False.

cpp:github.com/microsoft/LightGBM/blob/master/src/objective/binary_objective.hpp


  void GetGradients(const double* score, score_t* gradients, score_t* hessians) const override {
    if (!need_train_) {
      return;
    }
    if (weights_ == nullptr) {
      #pragma omp parallel for schedule(static)
      for (data_size_t i = 0; i < num_data_; ++i) {
        // get label and label weights
        const int is_pos = is_pos_(label_[i]);
        const int label = label_val_[is_pos];
        const double label_weight = label_weights_[is_pos];
        // calculate gradients and hessians
        const double response = -label * sigmoid_ / (1.0f + std::exp(label * sigmoid_ * score[i]));
        const double abs_response = fabs(response);
        gradients[i] = static_cast<score_t>(response * label_weight);
        hessians[i] = static_cast<score_t>(abs_response * (sigmoid_ - abs_response) * label_weight);
      }
    } else {
      #pragma omp parallel for schedule(static)
      for (data_size_t i = 0; i < num_data_; ++i) {
        // get label and label weights
        const int is_pos = is_pos_(label_[i]);
        const int label = label_val_[is_pos];
        const double label_weight = label_weights_[is_pos];
        // calculate gradients and hessians
        const double response = -label * sigmoid_ / (1.0f + std::exp(label * sigmoid_ * score[i]));
        const double abs_response = fabs(response);
        gradients[i] = static_cast<score_t>(response * label_weight  * weights_[i]);
        hessians[i] = static_cast<score_t>(abs_response * (sigmoid_ - abs_response) * label_weight * weights_[i]);
      }
    }
  }

As in the Poisson case, the score is not the prediction itself: the predicted value is sigmoid(score), which is why the gradient looks the way it does. Checking the derivatives with WolframAlpha as before for label = 0 and for label = 1, writing the objective in Python gives the following.
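
For reference, here is the derivation (again my own), using labels y in {-1, +1} as the C++ code does and sigmoid_ = 1. With predicted probability p = sigmoid(x) = 1 / (1 + e^(-x)) for raw score x, the log loss is

L(x) = log(1 + e^(-y * x))

and its derivatives with respect to x are

dL/dx = -y / (1 + e^(y * x)),    d^2L/dx^2 = p * (1 - p)

which are exactly response and abs_response * (sigmoid_ - abs_response) in the code above.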

def binary_metric(pred, data):
    true = data.get_label()
    prob = 1 / (1 + np.exp(-pred))  # predicted probability = sigmoid(raw score)
    loss = -(true * np.log(prob) + (1 - true) * np.log(1 - prob))
    return "binary", np.mean(loss), False

def binary_objective(pred, data):
    true = data.get_label()
    label = 2 * true - 1                           # map labels {0, 1} -> {-1, +1}
    response = -label / (1 + np.exp(label * pred))
    abs_response = np.abs(response)
    grad = response
    hess = abs_response * (1 - abs_response)
    return grad, hess
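
Likewise, a quick sketch of how predictions come back (my own example, with the same caveat about the fobj / feval interface): a model trained with this custom objective returns raw scores from predict(), so apply the sigmoid yourself to get probabilities.

y_binary = (y > 0).astype(float)   # turn the synthetic target into 0/1 labels
booster_binary = lgb.train(
    {"verbose": -1},
    lgb.Dataset(X, label=y_binary),
    num_boost_round=50,
    fobj=binary_objective,
    feval=binary_metric,
)
raw_score = booster_binary.predict(X)
prob = 1 / (1 + np.exp(-raw_score))   # predicted probability = sigmoid(raw score)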

In conclusion

This time I reproduced LightGBM's official objective implementations in Python. Understanding the basics introduced here should make it easier to create your own custom objectives. I would like to introduce the objectives I implemented in competitions in another article, so please look forward to it.
