[PYTHON] About max_iter of LogisticRegression() in scikit-learn

Background

I'm currently taking Udemy's course "Machine Learning with python: An Introduction to Identification Learned with scikit-learn". Sample code is distributed for each topic in the course, so I'm grateful that I don't have to write it myself, but running it produces a warning message.

Environment

jupyter-lab: 1.2.6, python: 3.7.7, scikit-learn: 0.22.1

Problem area

from sklearn import linear_model
from sklearn.model_selection import LeaveOneOut, cross_val_score

# X and y (features and labels) are defined earlier and are not shown in this post
clf = linear_model.LogisticRegression()  # max_iter defaults to 100
loocv = LeaveOneOut()  # leave-one-out: one fit per sample
scores = cross_val_score(clf, X, y, cv=loocv)

When I ran the code above, the following warning appeared:

ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
    https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
    https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
  extra_warning_msg=_LOGISTIC_SOLVER_CONVERGENCE_MSG)

The warning is also repeated many times, on the scale of thousands of lines.

Following the warning, I tried LogisticRegression with max_iter set to values from 1000 to 4000, but the same warning text appeared. When I looked into it, I read that max_iter = -1 makes the solver iterate until it converges, so I passed -1 as the argument, but that raised an error saying max_iter must be a positive number. Left with no choice, I set max_iter to values from 1000 to 5000, and then the process does not finish. When other people run it, training reportedly takes less than a second, so maybe something is wrong with my environment, but I can't tell what.
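The warning text offers two remedies: raise max_iter, or scale the data. The following is a minimal sketch of the second one, not a confirmed fix for this case. The X and y here are synthetic stand-ins generated with make_classification (an assumption, since the course data is not shown), and the mismatched feature scales are introduced on purpose. It assumes the features are numeric. As a side note, max_iter = -1 meaning "no limit" is, as far as I know, a convention of the SVM estimators such as SVC, not of LogisticRegression, which may explain the error.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import LeaveOneOut, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Stand-in data (assumption): the real X and y are not shown in the post.
X, y = make_classification(n_samples=60, n_features=5, random_state=0)
# Deliberately give each feature a very different scale.
X = X * np.array([1.0, 10.0, 100.0, 1000.0, 10000.0])

# Putting the scaler in a pipeline means it is re-fit on the training part
# of every leave-one-out split, so the held-out sample never influences it.
clf = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))

scores = cross_val_score(clf, X, y, cv=LeaveOneOut())
print("LOOCV accuracy:", scores.mean())
```

If the warning disappears with scaling in place, the original features were probably on very different scales. If it persists, the cause lies elsewhere.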

Since this is a warning rather than an error, training presumably still runs for up to the specified number of iterations, but having thousands of lines of warnings printed every time is a problem. For reference, X and y are each less than 1 MB in size, and I got the same result on Google Colab.
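If the only goal is to keep the output readable, the warning can be filtered with Python's standard warnings module and the ConvergenceWarning class from sklearn.exceptions. This is a sketch under assumptions: the data is again a synthetic stand-in, max_iter = 5 is chosen only to provoke the warning, and cross_val_score is run in a single process (the default), since a filter set here may not reach worker processes. Hiding the warning does not make the solver converge, so the sketch also reads n_iter_ to see whether the iteration limit was reached.

```python
import warnings

from sklearn.datasets import make_classification
from sklearn.exceptions import ConvergenceWarning
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import LeaveOneOut, cross_val_score

# Stand-in data (assumption): the real X and y are not shown in the post.
X, y = make_classification(n_samples=40, n_features=5, random_state=0)
X = X * 1000.0  # unscaled, large-valued features

# A very small limit, used here only to trigger the warning on purpose.
clf = LogisticRegression(max_iter=5)

with warnings.catch_warnings():
    # Ignore only ConvergenceWarning; other warnings are still shown.
    warnings.simplefilter("ignore", category=ConvergenceWarning)
    scores = cross_val_score(clf, X, y, cv=LeaveOneOut())
    clf.fit(X, y)  # one fit on all data, to inspect the iteration count

print("number of LOOCV fits:", len(scores))
print("iterations used:", clf.n_iter_, "limit:", clf.max_iter)
```

When the iteration count equals the limit, the solver stopped at the limit and the fitted coefficients may not be final, so the scores should be read with that in mind.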

If anyone has an idea what might be causing this, I would appreciate your advice.
