[PYTHON] Summary of evaluation functions used in machine learning

Introduction

I only had a vague understanding of evaluation functions (also called evaluation metrics), so I have summarized the typical ones here: what an evaluation function is, what each one means, and brief sample code for actually using each one.

What is an evaluation function?

An evaluation function is an **index that measures how good a trained model is**.

Difference from objective function

When studying machine learning, you will come across various names such as objective function, loss function, and cost function. First, let's check how the evaluation function differs from the objective function.

- Objective function
  - The function that is optimized during model training
  - Must be differentiable

In other words, the objective function is what gets optimized during training, while the evaluation function is the index used to check how good the model is after training.

The loss function, cost function, and error function are said to be kinds of objective function. (There seems to be some debate, but they can apparently be treated as roughly the same thing.)
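
As a minimal sketch of this distinction (using scikit-learn with made-up toy data): linear regression minimizes squared error as its objective function during fitting, and MAE can then be computed afterwards as the evaluation function.

# Minimal sketch: objective function vs. evaluation function
# (toy data made up for illustration)
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error

X = np.array([[1], [2], [3], [4]])
y = np.array([2.0, 4.1, 5.9, 8.2])

model = LinearRegression()  # internally minimizes squared error (objective function)
model.fit(X, y)             # optimization happens here, during training

y_pred = model.predict(X)
print(mean_absolute_error(y, y_pred))  # evaluation function, checked after training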

Regression

First, I will summarize the evaluation functions for regression problems.

MAE(Mean Absolute Error)

Mean absolute error. Represents the average of the absolute differences between the true values and the predicted values.

- Keeps the effect of outliers small (less sensitive to outliers)

\textrm{MAE} = \frac{1}{N}\sum_{i=1}^{N}|y_{i}-\hat{y}_{i}|

N: number of records
y_{i}: true value of the i-th record
\hat{y}_{i}: predicted value of the i-th record
from sklearn.metrics import mean_absolute_error
mean_absolute_error(y_true, y_pred)

MSE(Mean Squared Error)

Mean squared error. Represents the mean of the squared differences between the true values and the predicted values.

- Evaluates the effect of outliers heavily (sensitive to outliers)
- Its unit is the square of the objective variable's unit

\textrm{MSE} = \frac{1}{N}\sum_{i=1}^{N}(y_{i}-\hat{y}_{i})^{2}
from sklearn.metrics import mean_squared_error
mean_squared_error(y_true, y_pred)

RMSE(Root Mean Squared Error)

Root mean squared error. Represents the square root of the mean of the squared differences between the true values and the predicted values.

- Evaluates the effect of outliers heavily (sensitive to outliers)
- Its unit is the same as the objective variable's unit

\textrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}(y_{i}-\hat{y}_{i})^{2}}

There is no dedicated RMSE function in the scikit-learn module (at least in older versions), so it is calculated from mean_squared_error.

import numpy as np
from sklearn.metrics import mean_squared_error

np.sqrt(mean_squared_error(y_true, y_pred))
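
As an aside, newer versions of scikit-learn (0.22 and later, if I remember correctly) also accept a squared=False argument that returns RMSE directly; check the version you have installed.

from sklearn.metrics import mean_squared_error

# squared=False returns the square root of the MSE (scikit-learn 0.22+)
mean_squared_error(y_true, y_pred, squared=False)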

RMSLE(Root Mean Squared Logarithmic Error)

Represents the square root of the mean of the squared differences between the logarithms of the true values and the predicted values (each shifted by 1 before taking the logarithm).

- Used when the objective variable can take values over a wide range
- Evaluates the difference as a ratio rather than an absolute amount

\textrm{RMSLE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}(\log (1+y_{i})-\log (1+\hat{y}_{i}))^{2}}
import numpy as np
from sklearn.metrics import mean_squared_log_error

# note: all values must be non-negative
np.sqrt(mean_squared_log_error(y_true, y_pred))

Coefficient of determination

Denoted \textrm{R}^{2}. It shows the goodness of fit of the regression.

- Can be evaluated independently of the scale of the objective variable
- Takes a value of at most 1 (it can be negative for a model that fits worse than simply predicting the mean)

\textrm{R}^{2} = 1 - \frac{\sum_{i=1}^{N}(y_{i}-\hat{y}_{i})^{2}}{\sum_{i=1}^{N}(y_{i}-\bar{y})^{2}}
\bar{y} = \frac{1}{N}\sum_{i=1}^{N}y_{i}
from sklearn.metrics import r2_score
r2_score(y_true, y_pred)
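
As a quick check (toy values made up for illustration):

from sklearn.metrics import r2_score

y_true = [3.0, -0.5, 2.0, 7.0]
y_pred = [2.5, 0.0, 2.0, 8.0]

print(r2_score(y_true, y_pred))  # about 0.95, close to a perfect fit of 1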

Binary classification (when predicting positive or negative cases)

For classification problems, I will first summarize the evaluation functions used when predicting whether each record is a positive or a negative example.

Confusion matrix

Although it is not an evaluation metric itself, the evaluation metrics for predicting positive and negative examples are built from it, so I will explain it first.

The confusion matrix tabulates the classification results into the following four categories.

- TP (True Positive): a positive example correctly predicted as positive
- TN (True Negative): a negative example correctly predicted as negative
- FP (False Positive): a negative example incorrectly predicted as positive
- FN (False Negative): a positive example incorrectly predicted as negative

from sklearn.metrics import confusion_matrix
confusion_matrix(y_true, y_pred)

It is output in the following format.

|  | Predicted negative | Predicted positive |
|:---|:---:|:---:|
| Actually negative | TN | FP |
| Actually positive | FN | TP |
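
A small worked example (toy labels made up for illustration); for binary labels {0, 1}, scikit-learn returns the matrix as [[TN, FP], [FN, TP]]:

from sklearn.metrics import confusion_matrix

y_true = [0, 0, 1, 1, 1, 0]
y_pred = [0, 1, 1, 1, 0, 0]

# ravel() flattens [[TN, FP], [FN, TP]] into the four counts
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
print(tn, fp, fn, tp)  # -> 2 1 1 2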

Accuracy

Accuracy. Represents the proportion of all predictions that are correct.

\textrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}
from sklearn.metrics import accuracy_score
accuracy_score(y_true, y_pred)

Precision

Precision. Represents the proportion of records predicted as positive that are actually positive.

\textrm{Precision} = \frac{TP}{TP + FP}
from sklearn.metrics import precision_score
precision_score(y_true, y_pred)

Recall

Recall. Represents the proportion of actual positive examples that are correctly predicted as positive.

\textrm{Recall} = \frac{TP}{TP + FN}
from sklearn.metrics import recall_score
recall_score(y_true, y_pred)

F1-score

Also called the F-measure. An index that balances precision and recall.

It is calculated by the harmonic mean of precision and recall.

\textrm{F1-score} = \frac{2 \times \textrm{recall} \times \textrm{precision}}{\textrm{recall} + \textrm{precision}}
from sklearn.metrics import f1_score
f1_score(y_true, y_pred)
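
As a quick check, reusing the toy labels from the confusion-matrix example above (TP=2, FP=1, FN=1):

from sklearn.metrics import precision_score, recall_score, f1_score

y_true = [0, 0, 1, 1, 1, 0]
y_pred = [0, 1, 1, 1, 0, 0]

print(precision_score(y_true, y_pred))  # TP / (TP + FP) = 2/3
print(recall_score(y_true, y_pred))     # TP / (TP + FN) = 2/3
print(f1_score(y_true, y_pred))         # harmonic mean, also 2/3 here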

Binary classification (when predicting the probability of being a positive example)

Next, I will summarize the evaluation functions used when predicting the probability that each record is a positive example.

logloss

Also known as cross-entropy. It measures how close the predicted probability distribution is to the true distribution.

- Takes a value of 0 or greater (it is not bounded above by 1)
- Becomes smaller the more accurately you can predict

\begin{align}
\textrm{logloss} &= -\frac{1}{N}\sum_{i=1}^{N}\left(y_{i}\log p_{i} + (1 - y_{i})\log (1 - p_{i})\right) \\
&= -\frac{1}{N}\sum_{i=1}^{N} \log \acute{p}_{i}
\end{align}
y_{i}: label indicating whether record i is a positive example (1 or 0)
p_{i}: predicted probability that record i is a positive example
\acute{p}_{i}: predicted probability of the true label of record i
from sklearn.metrics import log_loss
log_loss(y_true, y_prob)
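
A toy example (probabilities made up) to see the behavior:

from sklearn.metrics import log_loss

y_true = [1, 0, 1, 1]
y_prob = [0.9, 0.2, 0.7, 0.6]  # predicted probability of the positive class

print(log_loss(y_true, y_prob))  # the closer the predictions, the smaller the loss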

AUC

Represents the area under the ROC curve.

- A random prediction gives 0.5
- A perfect prediction gives 1.0
- Often used for classification on imbalanced data
- Evaluates based on the relationship between the predicted probabilities and the true labels (1 or 0)

from sklearn.metrics import roc_auc_score
roc_auc_score(y_true, y_prob)

What is the ROC curve?

It is a graph that plots the true positive rate against the false positive rate as the threshold for classifying a prediction as positive is moved from 0 to 1. It shows how the true positive rate and the false positive rate change when the threshold is changed.

- True positive rate: the proportion of all actual positive examples that are predicted as positive
- False positive rate: the proportion of all actual negative examples that are predicted as positive

(Figure: ROC curve, plotting the true positive rate against the false positive rate)

- (1.0, 1.0): every record is predicted as positive
- (0.0, 0.0): every record is predicted as negative
- (0.0, 1.0): every record is predicted correctly
- The diagonal straight line corresponds to random prediction (AUC = 0.5)
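
A sketch of drawing the curve itself with scikit-learn's roc_curve (toy probabilities made up):

from sklearn.metrics import roc_curve, roc_auc_score

y_true = [0, 0, 1, 1]
y_prob = [0.1, 0.4, 0.35, 0.8]

# roc_curve sweeps the threshold and returns one (FPR, TPR) point per threshold
fpr, tpr, thresholds = roc_curve(y_true, y_prob)
print(fpr, tpr)
print(roc_auc_score(y_true, y_prob))  # area under that curve (0.75 here)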

Multi-class classification

Multi-class accuracy

It is an index that extends the accuracy of binary classification to multi-class classification. Represents the percentage of records that are correctly predicted.

from sklearn.metrics import accuracy_score
accuracy_score(y_true, y_pred)

mean-F1/macro-F1/micro-F1

It is an index that extends F1-score to multi-class classification.

- mean-F1: the average of F1-scores computed per record
- macro-F1: the average of F1-scores computed per class
- micro-F1: the F1-score computed from TP/TN/FP/FN counted over every record x class pair

from sklearn.metrics import f1_score

f1_score(y_true, y_pred, average='samples')  # mean-F1 (for multi-label targets)
f1_score(y_true, y_pred, average='macro')    # macro-F1
f1_score(y_true, y_pred, average='micro')    # micro-F1
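
A toy 3-class example (labels made up) contrasting macro and micro averaging:

from sklearn.metrics import f1_score

y_true = [0, 1, 2, 0, 1, 2]
y_pred = [0, 2, 1, 0, 0, 1]

print(f1_score(y_true, y_pred, average='macro'))  # unweighted mean of per-class F1
print(f1_score(y_true, y_pred, average='micro'))  # from global TP/FP/FN counts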

Multi-class logloss

It is an index that extends the log loss of binary classification to multi-class classification.

\textrm{multiclass logloss} = -\frac{1}{N}\sum_{i=1}^{N}\sum_{m=1}^{M}y_{i,m}\log p_{i,m}
M: number of classes
y_{i,m}: label indicating whether record i belongs to class m (1 or 0)
p_{i,m}: predicted probability that record i belongs to class m
from sklearn.metrics import log_loss

# y_prob is an N x M array of per-class predicted probabilities
log_loss(y_true, y_prob)
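
A toy 3-class example (probabilities made up); each row of the probability matrix should sum to 1:

from sklearn.metrics import log_loss

y_true = [0, 1, 2]
y_prob = [[0.8, 0.1, 0.1],
          [0.2, 0.7, 0.1],
          [0.1, 0.2, 0.7]]

print(log_loss(y_true, y_prob))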


Postscript

**Added on June 25, 2020** The confusion matrix table was incorrect and has been fixed. Thank you, @Fizunimo.
