Reproduce LightGBM Objective with Python

One of the features of LightGBM, a library popular on Kaggle, is that it optimizes a model using the gradient of the objective function you want to minimize. Choosing this objective function well is one of the tricks for building a good model, and many objectives are implemented by default. In addition, LightGBM lets you pass a Python function that computes the gradient yourself. In this article, I would like to show what kind of implementation this involves by recreating equivalent objectives in Python while referring to LightGBM's official implementation.

Prerequisites

In this article, it is assumed that regression / binary classification is performed by passing an lgb.Dataset to lgb.train. When writing an objective for multi-class classification, I think [Mr. Tawara's article (forcibly performing Multi-Task (huh???) Regression with LightGBM)](https://tawara.hatenablog.com/entry/2020/05/14/120016) is easy to follow. Also, I will not explain here why grad and hess are needed, so please refer to other materials for that.
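
For concreteness, the setup assumed throughout this article looks roughly like the following (a minimal sketch with synthetic data of my own; the actual data and parameters are up to you). The custom objectives and metrics below plug into this same lgb.train call: depending on your LightGBM version, a custom objective is passed either as the fobj argument or as a callable in params["objective"], and a custom metric as feval.

import numpy as np
import lightgbm as lgb

# Synthetic regression data, just for illustration
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))
y = X[:, 0] + rng.normal(size=1000)

train_set = lgb.Dataset(X, label=y)
booster = lgb.train({"objective": "l2", "verbose": -1}, train_set, num_boost_round=50)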

Imitate the official behavior

When objective = "l2"

Now for the main subject. Since the core of LightGBM is implemented in C++ for speed, the objective part is also written in C++. Let's read the code (https://github.com/microsoft/LightGBM/blob/master/src/objective/regression_objective.hpp) for the behavior when objective = "l2". The part that calculates the gradient is implemented in GetGradients().

cpp:github.com/microsoft/LightGBM/blob/master/src/objective/regression_objective.hpp


  void GetGradients(const double* score, score_t* gradients,
                    score_t* hessians) const override {
    if (weights_ == nullptr) {
      #pragma omp parallel for schedule(static)
      for (data_size_t i = 0; i < num_data_; ++i) {
        gradients[i] = static_cast<score_t>(score[i] - label_[i]);
        hessians[i] = 1.0f;
      }
    } else {
      #pragma omp parallel for schedule(static)
      for (data_size_t i = 0; i < num_data_; ++i) {
        gradients[i] = static_cast<score_t>((score[i] - label_[i]) * weights_[i]);
        hessians[i] = static_cast<score_t>(weights_[i]);
      }
    }
  }

It is too slow to be practical, but reproducing this in Python (ignoring sample weights) looks like this.

def l2_loss(pred, data):
    # data is the lgb.Dataset passed to lgb.train; pred is the raw score per row
    true = data.get_label()
    grad = pred - true          # first derivative of 0.5 * (pred - true)^2
    hess = np.ones(len(grad))   # second derivative is the constant 1
    return grad, hess
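
For example, using the synthetic data from the sketch above, the custom objective can be passed to lgb.train like this (here via the fobj argument; on recent LightGBM versions the callable goes into params["objective"] instead). It should train essentially the same model as objective = "l2", up to details such as the initial score.

booster_custom = lgb.train(
    {"verbose": -1},
    lgb.Dataset(X, label=y),
    num_boost_round=50,
    fobj=l2_loss,   # custom objective returning (grad, hess)
)
pred_custom = booster_custom.predict(X)   # raw scores; for l2 these are the predictions themselves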

When objective = "poisson"

This objective minimizes the Poisson loss. Reading the metric code for the Poisson loss, it is as follows.

cpp:github.com/microsoft/LightGBM/blob/master/src/metric/regression_metric.hpp


class PoissonMetric: public RegressionMetric<PoissonMetric> {
 public:
  explicit PoissonMetric(const Config& config) :RegressionMetric<PoissonMetric>(config) {
  }

  inline static double LossOnPoint(label_t label, double score, const Config&) {
    const double eps = 1e-10f;
    if (score < eps) {
      score = eps;
    }
    return score - label * std::log(score);
  }
  inline static const char* Name() {
    return "poisson";
  }
};

And reading the objective, it has the following implementation.

cpp:github.com/microsoft/LightGBM/blob/master/src/objective/regression_objective.hpp


  void GetGradients(const double* score, score_t* gradients,
                    score_t* hessians) const override {
    if (weights_ == nullptr) {
      #pragma omp parallel for schedule(static)
      for (data_size_t i = 0; i < num_data_; ++i) {
        gradients[i] = static_cast<score_t>(std::exp(score[i]) - label_[i]);
        hessians[i] = static_cast<score_t>(std::exp(score[i] + max_delta_step_));
      }
    } else {
      #pragma omp parallel for schedule(static)
      for (data_size_t i = 0; i < num_data_; ++i) {
        gradients[i] = static_cast<score_t>((std::exp(score[i]) - label_[i]) * weights_[i]);
        hessians[i] = static_cast<score_t>(std::exp(score[i] + max_delta_step_) * weights_[i]);
      }
    }
  }

... did you notice? Actually, the score in this objective is not the predicted value itself: the predicted value is expressed as predicted value = e^score, i.e. the score is the exponent x in e^x. You can confirm this by entering the formula into WolframAlpha. Therefore, when you write a Poisson objective yourself (the same applies to gamma and tweedie), you have to compute the metric with predicted value = e^pred.
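
As a sanity check (my own derivation, not something stated in the LightGBM source): with predicted value = e^x for raw score x and label y, the per-sample Poisson loss (dropping constant terms) is

L(x) = e^x - y * x

and its first and second derivatives with respect to x are

dL/dx = e^x - y,    d^2L/dx^2 = e^x

which is exactly the gradient above; the hessian in the C++ code is additionally multiplied by e^max_delta_step for numerical stability.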

def poisson_metric(pred, data):
    # pred is the raw score x, so the predicted value is np.exp(pred)
    true = data.get_label()
    loss = np.exp(pred) - true * pred
    return "poisson", np.mean(loss), False

def poisson_object(pred, data):
    poisson_max_delta_step = 0.7  # LightGBM's default for poisson_max_delta_step
    true = data.get_label()
    grad = np.exp(pred) - true
    hess = np.exp(pred + poisson_max_delta_step)
    return grad, hess
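
A usage sketch under the same assumptions as before (my own example; the labels are made non-negative because Poisson regression requires it): since the model is trained on the raw score, predictions also come back as the exponent and have to be converted with np.exp.

train_poisson = lgb.Dataset(X, label=np.abs(y))   # Poisson regression needs non-negative labels
booster_poisson = lgb.train(
    {"verbose": -1},
    train_poisson,
    num_boost_round=50,
    valid_sets=[train_poisson],   # feval is evaluated on valid_sets
    fobj=poisson_object,
    feval=poisson_metric,
)
raw_score = booster_poisson.predict(X)   # this is x in predicted value = e^x
pred = np.exp(raw_score)                 # convert back to the actual predicted value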

When objective = "binary"

Getting a bit carried away, I would also like to look at the objective for binary classification. The metric for binary is as follows.

cpp:github.com/microsoft/LightGBM/blob/master/src/metric/binary_metric.hpp


class BinaryLoglossMetric: public BinaryMetric<BinaryLoglossMetric> {
 public:
  explicit BinaryLoglossMetric(const Config& config) :BinaryMetric<BinaryLoglossMetric>(config) {}

  inline static double LossOnPoint(label_t label, double prob) {
    if (label <= 0) {
      if (1.0f - prob > kEpsilon) {
        return -std::log(1.0f - prob);
      }
    } else {
      if (prob > kEpsilon) {
        return -std::log(prob);
      }
    }
    return -std::log(kEpsilon);
  }

  inline static const char* Name() {
    return "binary_logloss";
  }
};

Note that in the objective, sigmoid_ = 1, label_val_ = [-1, 1], and label_weights_ = [1, 1] when is_unbalance = False.

cpp:github.com/microsoft/LightGBM/blob/master/src/objective/binary_objective.hpp


  void GetGradients(const double* score, score_t* gradients, score_t* hessians) const override {
    if (!need_train_) {
      return;
    }
    if (weights_ == nullptr) {
      #pragma omp parallel for schedule(static)
      for (data_size_t i = 0; i < num_data_; ++i) {
        // get label and label weights
        const int is_pos = is_pos_(label_[i]);
        const int label = label_val_[is_pos];
        const double label_weight = label_weights_[is_pos];
        // calculate gradients and hessians
        const double response = -label * sigmoid_ / (1.0f + std::exp(label * sigmoid_ * score[i]));
        const double abs_response = fabs(response);
        gradients[i] = static_cast<score_t>(response * label_weight);
        hessians[i] = static_cast<score_t>(abs_response * (sigmoid_ - abs_response) * label_weight);
      }
    } else {
      #pragma omp parallel for schedule(static)
      for (data_size_t i = 0; i < num_data_; ++i) {
        // get label and label weights
        const int is_pos = is_pos_(label_[i]);
        const int label = label_val_[is_pos];
        const double label_weight = label_weights_[is_pos];
        // calculate gradients and hessians
        const double response = -label * sigmoid_ / (1.0f + std::exp(label * sigmoid_ * score[i]));
        const double abs_response = fabs(response);
        gradients[i] = static_cast<score_t>(response * label_weight  * weights_[i]);
        hessians[i] = static_cast<score_t>(abs_response * (sigmoid_ - abs_response) * label_weight * weights_[i]);
      }
    }
  }

As in the Poisson case, the score is not the prediction itself: the predicted value is sigmoid(score), which is why the gradient looks the way it does. Checking the derivatives with WolframAlpha as before for label = 0 and for label = 1, writing the objective in Python gives the following.
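
For reference, here is the derivation (again my own), using labels y in {-1, +1} as the C++ code does and sigmoid_ = 1. With predicted probability p = sigmoid(x) = 1 / (1 + e^(-x)) for raw score x, the log loss is

L(x) = log(1 + e^(-y * x))

and its derivatives with respect to x are

dL/dx = -y / (1 + e^(y * x)),    d^2L/dx^2 = p * (1 - p)

which are exactly response and abs_response * (sigmoid_ - abs_response) in the code above.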

def binary_metric(pred, data):
    true = data.get_label()
    prob = 1 / (1 + np.exp(-pred))  # predicted probability = sigmoid(raw score)
    loss = -(true * np.log(prob) + (1 - true) * np.log(1 - prob))
    return "binary", np.mean(loss), False

def binary_objective(pred, data):
    true = data.get_label()
    label = 2 * true - 1                           # map labels {0, 1} -> {-1, +1}
    response = -label / (1 + np.exp(label * pred))
    abs_response = np.abs(response)
    grad = response
    hess = abs_response * (1 - abs_response)
    return grad, hess
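
Likewise, a quick sketch of how predictions come back (my own example, with the same caveat about the fobj / feval interface): a model trained with this custom objective returns raw scores from predict(), so apply the sigmoid yourself to get probabilities.

y_binary = (y > 0).astype(float)   # turn the synthetic target into 0/1 labels
booster_binary = lgb.train(
    {"verbose": -1},
    lgb.Dataset(X, label=y_binary),
    num_boost_round=50,
    fobj=binary_objective,
    feval=binary_metric,
)
raw_score = booster_binary.predict(X)
prob = 1 / (1 + np.exp(-raw_score))   # predicted probability = sigmoid(raw score)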

In conclusion

This time I reproduced LightGBM's official objective implementations in Python. Understanding the basics introduced here should make it easier to create your own custom objectives. I would like to introduce the objectives I implemented in competitions in another article, so please look forward to it.
