[PYTHON] Machine learning / classification related techniques

Logistic regression

from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression

data = load_breast_cancer()
X = data.data
# Invert labels 0 and 1 so that malignant becomes 1 (the positive class)
y = 1 - data.target

# Use only the first 10 features
X = X[:, :10]

# Fit the model and predict on the same data used for training
model_lor = LogisticRegression(max_iter=1000)
model_lor.fit(X, y)
y_pred = model_lor.predict(X)

Confusion matrix

・ A matrix of 2 rows x 2 columns is displayed
・ It cross-tabulates the actual labels (rows) against the predicted labels (columns)
・ Upper left is (actual 0, predicted 0), lower right is (actual 1, predicted 1)

from sklearn.metrics import confusion_matrix
cm = confusion_matrix(y, y_pred)
print(cm)
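
As a rough sketch of what the four cells contain, the counts can be reproduced by hand in plain Python. The labels below are toy values made up for illustration (not the breast cancer data), and `confusion_counts` is a hypothetical helper, not a scikit-learn function. The layout assumed here is rows = actual labels, columns = predicted labels.

```python
# Toy labels, made up for illustration (not the breast cancer data).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_hat = [1, 1, 0, 0, 0, 0, 0, 1]


def confusion_counts(y_true, y_hat):
    """Return (tn, fp, fn, tp) for binary 0/1 labels (hypothetical helper)."""
    tn = fp = fn = tp = 0
    for actual, predicted in zip(y_true, y_hat):
        if actual == 0 and predicted == 0:
            tn += 1  # true negative
        elif actual == 0 and predicted == 1:
            fp += 1  # false positive
        elif actual == 1 and predicted == 0:
            fn += 1  # false negative
        else:
            tp += 1  # true positive
    return tn, fp, fn, tp


tn, fp, fn, tp = confusion_counts(y_true, y_hat)
# Rows are actual labels, columns are predicted labels.
cm_by_hand = [[tn, fp], [fn, tp]]
print(cm_by_hand)  # [[3, 1], [2, 2]]
```

With scikit-learn, the same four numbers should be available as `tn, fp, fn, tp = confusion_matrix(y, y_pred).ravel()`.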

Accuracy

・ Proportion of all predictions that are correct

from sklearn.metrics import accuracy_score
accuracy_score(y, y_pred)
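
A minimal sketch of the same calculation done by hand, assuming toy labels made up for illustration: accuracy is simply the number of positions where the prediction matches the actual label, divided by the number of samples.

```python
# Toy labels, made up for illustration (not the breast cancer data).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_hat = [1, 1, 0, 0, 0, 0, 0, 1]

# Count the positions where prediction and actual label agree.
n_correct = sum(actual == predicted for actual, predicted in zip(y_true, y_hat))
accuracy = n_correct / len(y_true)
print(accuracy)  # 0.625
```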

Precision

・ Proportion of samples that are actually positive among those predicted to be positive (right column of the confusion matrix)

from sklearn.metrics import precision_score
precision_score(y, y_pred)
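
For reference, a hand calculation on toy labels (made up for illustration) might look like the following. Only the samples predicted as 1 matter here: TP are the correct ones among them, FP the incorrect ones.

```python
# Toy labels, made up for illustration (not the breast cancer data).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_hat = [1, 1, 0, 0, 0, 0, 0, 1]

pairs = list(zip(y_true, y_hat))
tp = sum(1 for actual, predicted in pairs if actual == 1 and predicted == 1)
fp = sum(1 for actual, predicted in pairs if actual == 0 and predicted == 1)

# Precision = TP / (TP + FP): how many predicted positives are really positive.
precision = tp / (tp + fp)
print(round(precision, 3))  # 0.667
```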

Recall

・ Proportion of samples correctly predicted as positive among those that are actually positive (bottom row of the confusion matrix)

from sklearn.metrics import recall_score
recall_score(y, y_pred)
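
Recall can be sketched by hand in the same way, again on toy labels made up for illustration. This time only the samples whose actual label is 1 matter: TP are the ones the model found, FN the ones it missed.

```python
# Toy labels, made up for illustration (not the breast cancer data).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_hat = [1, 1, 0, 0, 0, 0, 0, 1]

pairs = list(zip(y_true, y_hat))
tp = sum(1 for actual, predicted in pairs if actual == 1 and predicted == 1)
fn = sum(1 for actual, predicted in pairs if actual == 1 and predicted == 0)

# Recall = TP / (TP + FN): how many actual positives were detected.
recall = tp / (tp + fn)
print(recall)  # 0.5
```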

F-measure (F1 score)

・ Harmonic mean of recall and precision
・ There is a trade-off between precision and recall, so the F-measure summarizes both in a single number

from sklearn.metrics import f1_score
f1_score(y, y_pred)
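
The harmonic mean can be written out directly. The sketch below uses a hypothetical helper `f1_from_pr` and made-up precision/recall values; it is meant to show why the harmonic mean is used: it stays low unless both precision and recall are high.

```python
def f1_from_pr(precision, recall):
    """Harmonic mean of precision and recall (hypothetical helper)."""
    if precision + recall == 0:
        return 0.0  # convention chosen here to avoid division by zero
    return 2 * precision * recall / (precision + recall)


# Precision 2/3 and recall 1/2 give F1 = 4/7 (about 0.571).
f1_balanced = f1_from_pr(2 / 3, 0.5)

# Very high precision cannot compensate for very low recall:
# the arithmetic mean would be 0.5, but F1 is only about 0.18.
f1_lopsided = f1_from_pr(0.9, 0.1)

print(round(f1_balanced, 3), round(f1_lopsided, 3))
```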

Predicted probability

・ predict_proba expresses the classification as probabilities between 0 and 1 for class 0 and class 1 (the two add up to 1)
・ By default, scikit-learn's predict uses 0.5 as the threshold


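#model_lor.predict_proba(X)

from sklearn.metrics import recall_score

# Lower the threshold from the default 0.5 to 0.1
# (np.int was removed from recent NumPy versions, so the built-in int is used)
y_pred2 = (model_lor.predict_proba(X)[:, 1] > 0.1).astype(int)
print(confusion_matrix(y, y_pred2))

print(accuracy_score(y, y_pred2))
print(recall_score(y, y_pred2))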
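
To illustrate how the threshold moves precision and recall in opposite directions, here is a small sketch on hypothetical probabilities and labels (toy values, not output from `model_lor`). `precision_recall_at` is a helper introduced only for this example.

```python
# Hypothetical predicted probabilities of class 1 and actual labels (toy data).
proba = [0.95, 0.80, 0.60, 0.40, 0.30, 0.20, 0.15, 0.05]
y_true = [1, 1, 0, 1, 0, 1, 0, 0]


def precision_recall_at(threshold):
    """Return (precision, recall) when predicting 1 for proba > threshold."""
    y_hat = [1 if p > threshold else 0 for p in proba]
    pairs = list(zip(y_true, y_hat))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


for th in (0.1, 0.5, 0.9):
    precision, recall = precision_recall_at(th)
    print(f"threshold={th}: precision={precision:.2f}, recall={recall:.2f}")
# threshold=0.1: precision=0.57, recall=1.00
# threshold=0.5: precision=0.67, recall=0.50
# threshold=0.9: precision=1.00, recall=0.25
```

A low threshold such as 0.1 catches every positive in this toy data (recall 1.0) at the cost of more false positives; a high threshold does the opposite.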

ROC curve / AUC (needs further study)

・ AUC: Area Under the Curve
・ ROC: Receiver Operating Characteristic
・ AUC is the area under the ROC curve
・ ROC curve: horizontal axis is the False Positive Rate (FPR), vertical axis is the True Positive Rate (TPR)


from sklearn.metrics import roc_curve
probas = model_lor.predict_proba(X)
fpr, tpr, thresholds = roc_curve(y, probas[:, 1])

%matplotlib inline
import matplotlib.pyplot as plt
plt.style.use('fivethirtyeight')

fig, ax = plt.subplots()
fig.set_size_inches(4.8, 5)

ax.step(fpr, tpr, 'gray')
ax.fill_between(fpr, tpr, 0, color='skyblue', alpha=0.8)
ax.set_xlabel('False Positive Rate')
ax.set_ylabel('True Positive Rate')
ax.set_facecolor('xkcd:white')
plt.show()

from sklearn.metrics import roc_auc_score
roc_auc_score(y, probas[:, 1])
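
As a rough sketch of what these functions compute, the ROC points and the area under them can be built by hand. The scores and labels below are toy values made up for illustration, the helper names are hypothetical, and the code assumes there are no tied scores. The area is computed twice: with the trapezoidal rule over the ROC points, and as the fraction of (positive, negative) pairs in which the positive sample gets the higher score.

```python
# Hypothetical scores for class 1 and actual labels (toy data, no tied scores).
scores = [0.95, 0.80, 0.60, 0.40, 0.30, 0.20, 0.15, 0.05]
y_true = [1, 1, 0, 1, 0, 1, 0, 0]


def roc_points(y_true, scores):
    """Sweep the threshold from high to low and collect (FPR, TPR) points."""
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    fpr, tpr = [0.0], [0.0]
    tp = fp = 0
    for _, label in sorted(zip(scores, y_true), reverse=True):
        if label == 1:
            tp += 1
        else:
            fp += 1
        fpr.append(fp / n_neg)
        tpr.append(tp / n_pos)
    return fpr, tpr


def trapezoid_area(xs, ys):
    """Area under the piecewise-linear curve through the (x, y) points."""
    return sum(
        (xs[i + 1] - xs[i]) * (ys[i] + ys[i + 1]) / 2 for i in range(len(xs) - 1)
    )


fpr, tpr = roc_points(y_true, scores)
auc_trapezoid = trapezoid_area(fpr, tpr)

# Equivalent view: share of (positive, negative) pairs ranked correctly.
pos = [s for s, label in zip(scores, y_true) if label == 1]
neg = [s for s, label in zip(scores, y_true) if label == 0]
auc_pairs = sum(p > n for p in pos for n in neg) / (len(pos) * len(neg))

print(auc_trapezoid, auc_pairs)  # 0.8125 0.8125
```

An AUC of 1.0 would mean every positive is scored above every negative; 0.5 corresponds to random ordering.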
