[PYTHON] Note that the Logistic Regression solver has changed its default value to lbfgs.

Introduction

In scikit-learn version 0.22, the default value of the solver parameter of LogisticRegression (https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html) changed from liblinear to lbfgs.

Because of this change, code that has not been modified may now produce results that differ from before, or may raise an error.

An example of the problem

The following code, which applies L1 regularization, raises an error.

lr_l1 = LogisticRegression(C=C, penalty='l1').fit(X_train, y_train)

The error output is as follows.

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
~/devp/linear_model.py in <module>
      1 for C, marker in zip([0.001, 1, 100], ['o', '^', 'v']):
----> 2     lr_l1 = LogisticRegression(C=C, penalty='l1').fit(X_train, y_train)
      3     print('Training accuracy of l1 logreg with C={:.f3}: {:.2f}'.format(C, lr_l1.score(X_train, y_train)))
      4     print('Test accuracy of l1 logreg with C={:.f3}: {:.2f}'.format(C, lr_l1.score(X_test, y_test)))
      5     plt.plot(lr_l1.coef_.T, marker, label='C={:.3f}'.format(C))

/usr/local/lib/python3.7/site-packages/sklearn/linear_model/_logistic.py in fit(self, X, y, sample_weight)
   1484         The SAGA solver supports both float64 and float32 bit arrays.
   1485         """
-> 1486         solver = _check_solver(self.solver, self.penalty, self.dual)
   1487 
   1488         if not isinstance(self.C, numbers.Number) or self.C < 0:

/usr/local/lib/python3.7/site-packages/sklearn/linear_model/_logistic.py in _check_solver(solver, penalty, dual)
    443     if solver not in ['liblinear', 'saga'] and penalty not in ('l2', 'none'):
    444         raise ValueError("Solver %s supports only 'l2' or 'none' penalties, "
--> 445                          "got %s penalty." % (solver, penalty))
    446     if solver != 'liblinear' and dual:
    447         raise ValueError("Solver %s supports only "

ValueError: Solver lbfgs supports only 'l2' or 'none' penalties, got l1 penalty.

The error says that the lbfgs solver supports only the 'l2' and 'none' penalties, so it cannot be used with penalty='l1'.

Specify solver explicitly to resolve the error

Pass the solver argument explicitly, as follows.

lr_l1 = LogisticRegression(C=C, penalty='l1', solver='liblinear').fit(X_train, y_train)
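
For reference, here is a minimal self-contained sketch of the fix. The synthetic dataset from make_classification and the helper name fit_l1 are assumptions for illustration only; the original data is not shown in this article. The _check_solver code in the traceback also lists saga as a solver that accepts L1, so the helper takes the solver as an argument. The format string is written as {:.3f}, because {:.f3} as it appears in the traceback is not a valid format specification.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (an assumption; the article's dataset is not shown).
X, y = make_classification(n_samples=300, n_features=20, n_informative=5,
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)


def fit_l1(C, solver="liblinear"):
    """Fit an L1-penalized model with an explicitly chosen solver.

    fit_l1 is a hypothetical helper name used only in this sketch.
    """
    model = LogisticRegression(C=C, penalty="l1", solver=solver, max_iter=5000)
    return model.fit(X_train, y_train)


for C in [0.001, 1, 100]:
    lr_l1 = fit_l1(C)
    print("Training accuracy of l1 logreg with C={:.3f}: {:.2f}".format(
        C, lr_l1.score(X_train, y_train)))
    print("Test accuracy of l1 logreg with C={:.3f}: {:.2f}".format(
        C, lr_l1.score(X_test, y_test)))
```

Calling fit_l1(1, solver="saga") should also work, although saga may need more iterations or scaled features to converge.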

A little more about the cause

As stated in the LogisticRegression documentation, the default value of solver was changed in an update. That is the cause.

Changed in version 0.22: The default solver changed from ‘liblinear’ to ‘lbfgs’ in 0.22.

As a result, code that applies L1 regularization but omits solver, on the assumption that the default is liblinear, is affected and raises an error.

In addition, code that omitted solver on the assumption that the default was liblinear now runs with lbfgs instead. Even when no error occurs, the output can differ from past results.
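
One way to see this silent difference is to fit the same data with both solvers and compare the coefficients. This is only a sketch on synthetic data (an assumption, not the article's dataset); the size of the difference depends on the data and on C.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Synthetic stand-in data (an assumption for illustration).
X, y = make_classification(n_samples=300, n_features=20, n_informative=5,
                           random_state=0)

# Fit with the old default (liblinear) and the new default (lbfgs),
# using the default L2 penalty in both cases.
coef_by_solver = {}
for solver in ["liblinear", "lbfgs"]:
    model = LogisticRegression(C=1.0, solver=solver, max_iter=5000).fit(X, y)
    coef_by_solver[solver] = model.coef_.ravel()

max_abs_diff = np.max(np.abs(coef_by_solver["liblinear"] - coef_by_solver["lbfgs"]))
print("Largest coefficient difference between solvers:", max_abs_diff)
```

Writing the solver explicitly in your own code keeps the behavior the same regardless of what the library's default is.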

Conclusion

A change on the library side can alter the behavior of your code or stop it from working, so be sure to check what changed whenever a library you use often is upgraded.
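
As a quick check, the installed version and the default solver actually in effect can be printed like this. It is a small sketch; the variable name default_solver is just for illustration.

```python
import sklearn
from sklearn.linear_model import LogisticRegression

# Show which scikit-learn version is installed.
print("scikit-learn version:", sklearn.__version__)

# Inspect the default solver without fitting anything.
default_solver = LogisticRegression().get_params()["solver"]
print("Default solver:", default_solver)
```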
