[Python] Introduction to Machine Learning: Notes

Aidemy 2020/9/23

Introduction

Hello, this is Yope! I'm a liberal arts college student, but I'm interested in the AI field, so I'm studying at the AI-specialized school "Aidemy". I want to share what I learn there, so I am summarizing it on Qiita. I am very happy that many people read my previous summary article. Thank you! This time, I will write down the key points from Introduction to Machine Learning.

Introduction to Machine Learning 1 Key Points

・ Machine learning is divided into __"supervised learning", "unsupervised learning", and "reinforcement learning"__.
・ Supervised learning is a method in which the model is given training data together with correct-answer (teacher) data and keeps learning until its answers match the correct ones. It is the most commonly used type.
・ Unsupervised learning is a method in which only training data is given and the computer finds regularities in it on its own.
・ Reinforcement learning is a method in which an agent (actor) keeps learning so as to maximize the reward it can obtain.

Introduction to Machine Learning 2 Key Points

・ __Supervised learning procedure__: Data collection → Data cleansing → Training → Check with test data → Implementation

・ __Supervised learning practice 1__: Holdout method: The data is split into training data and test data with the train_test_split() function. __X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=fraction of test data, random_state=0)__
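To make the holdout method concrete, here is a minimal sketch. The toy arrays `X` and `y` and the 20% test fraction are my own assumptions for illustration, not values from the course.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Toy data (made up for illustration): 10 samples with 2 features each
X = np.arange(20).reshape(10, 2)
y = np.array([0, 1] * 5)

# Hold out 20% of the samples as test data; random_state fixes the shuffle
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

print(X_train.shape, X_test.shape)  # (8, 2) (2, 2)
```

Fixing random_state means the same split is produced on every run, which makes results easier to compare.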

・ __Supervised learning practice 2__: k-fold cross-validation: The data is divided into k parts, one of which is used as test data and the rest as training data. The test part is changed each round, so the model is verified k times in total and the average performance is calculated. (For example, with 20 samples and k = 20, each round uses 19 samples for training and 1 for testing, and verification is done 20 times in total.)
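The splitting and averaging described above can be sketched as follows. This is only an illustration under my own assumptions: the dummy 20-sample array, the iris dataset, the LogisticRegression model, and k = 5 do not come from the course.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold, cross_val_score

# 20 dummy samples: with k=20, every round trains on 19 and tests on 1
X_toy = np.arange(20).reshape(20, 1)
fold_sizes = [(len(train_idx), len(test_idx))
              for train_idx, test_idx in KFold(n_splits=20).split(X_toy)]
print(fold_sizes[0], len(fold_sizes))  # (19, 1) 20

# k=5 on the iris dataset: one accuracy score per fold, then the average
X, y = load_iris(return_X_y=True)
model = LogisticRegression(max_iter=1000)
scores = cross_val_score(model, X, y, cv=5)
print(scores)
print(scores.mean())
```

cross_val_score returns one score per fold, so the final number is simply the mean of k evaluations.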

・ __Overfitting__: A state in which the model fits the training data so closely that it fails to generalize and cannot handle unseen data.
・ __Dropout__: A means of avoiding overfitting in neural networks. Randomly chosen units are ignored during training.
・ __Ensemble learning__: Improve accuracy by training multiple models and averaging their results.
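As one possible illustration of ensemble learning, the sketch below averages the predicted class probabilities of three different models using scikit-learn's VotingClassifier with soft voting. The choice of models, the iris dataset, and the 30% test split are my own assumptions, not something specified in the course.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

# Soft voting averages the class probabilities predicted by each model
ensemble = VotingClassifier(
    estimators=[
        ("lr", LogisticRegression(max_iter=1000)),
        ("tree", DecisionTreeClassifier(random_state=0)),
        ("knn", KNeighborsClassifier()),
    ],
    voting="soft",
)
ensemble.fit(X_train, y_train)

# Accuracy of the averaged prediction on the held-out test data
accuracy = ensemble.score(X_test, y_test)
print(accuracy)
```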

Introduction to Machine Learning 3 Key Points

・ __Confusion matrix__: A table used to evaluate the accuracy of a model. Each result is classified as __"true positive", "false positive", "true negative", or "false negative"__. "True/false" indicates whether the model's answer was correct, and "positive/negative" indicates what the model answered. (That is, a "false positive" means the model answered positive, but the correct answer was negative.)

・ Implementation of the confusion matrix: Write it as follows (pass the list of correct answers as "y_true" and the list of model answers as "y_pred").

from sklearn.metrics import confusion_matrix

# Define the correct answers and the model's answers as lists (0 is positive, 1 is negative)
y_true = [1, 1, 1, 1, 1, 1]
y_pred = [1, 1, 1, 0, 0, 0]

# Rows are the correct labels and columns are the predicted labels, in sorted label order (0, 1)
confmat = confusion_matrix(y_true, y_pred)
print(confmat)
# [[0 0]   # [[true positive   false negative]
#  [3 3]]  #  [false positive  true negative]]

・ __Accuracy (correct answer rate)__: The fraction of all answers that were "true", i.e. correct. ((True positive + True negative) / Total)
・ __Precision__: The fraction of the answers predicted "positive" that were actually positive. (True positive / (True positive + False positive))
・ __Recall__: The fraction of the actually positive cases that the model predicted as positive. (True positive / (True positive + False negative))
・ __F value (F1 score)__: The harmonic mean of precision and recall. (2 * precision * recall / (precision + recall))
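The four formulas above can be checked by hand with plain Python. The counts below are hypothetical numbers I picked only to make the arithmetic easy to follow.

```python
# Hypothetical confusion-matrix counts, chosen only for illustration
tp, fp, fn, tn = 40, 10, 20, 30

accuracy = (tp + tn) / (tp + fp + fn + tn)
precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)

print(f"accuracy={accuracy:.3f} precision={precision:.3f} "
      f"recall={recall:.3f} f1={f1:.3f}")
# accuracy=0.700 precision=0.800 recall=0.667 f1=0.727
```

Note that the F value (0.727) lands between precision and recall but closer to the lower of the two, which is the point of using a harmonic mean.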

・ Implementation of the above evaluation metrics: Import each function and pass "y_true" and "y_pred" as arguments.

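# Import precision_score (precision), recall_score (recall), and f1_score (F value)
from sklearn.metrics import precision_score, recall_score, f1_score

y_true = [0, 0, 1, 1]
y_pred = [0, 1, 1, 1]
# Output the F value. scikit-learn treats label 1 as positive by default,
# so pos_label=0 is passed to keep the convention that 0 is positive.
print("F1: {:.6f}".format(f1_score(y_true, y_pred, pos_label=0)))
# F1: 0.666667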

・ __PR curve__: A graph with precision on the vertical axis and recall on the horizontal axis.
・ __Precision and recall are in a trade-off relationship__, so depending on the case you need to consider which one to emphasize. If you have no particular preference, it is recommended to use the F value or the point on the PR curve where precision and recall are equal (the __break-even point (BEP)__).
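As a rough sketch of how the points of a PR curve and an approximate BEP could be computed, the example below uses scikit-learn's precision_recall_curve. The labels and scores are made up, and unlike the confusion matrix example, label 1 is treated as positive here because that is scikit-learn's default.

```python
import numpy as np
from sklearn.metrics import precision_recall_curve

# Made-up labels and model scores (a higher score means "more likely label 1")
y_true = np.array([0, 0, 1, 0, 1, 1, 0, 1])
y_score = np.array([0.1, 0.3, 0.35, 0.4, 0.55, 0.6, 0.7, 0.9])

# One (precision, recall) pair per decision threshold, plus a final (1, 0) point
precision, recall, thresholds = precision_recall_curve(y_true, y_score)

# Approximate the break-even point: the curve point where P and R are closest
i = int(np.argmin(np.abs(precision - recall)))
print(f"BEP is near precision={precision[i]:.2f}, recall={recall[i]:.2f}")
```

Plotting recall on the horizontal axis against precision on the vertical axis (for example with matplotlib) gives the PR curve itself.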
