Introduction

My company gave me a one-month free research period, so I started studying machine learning, which I had been interested in for some time. Since this is a rare opportunity, I'll keep a record of what I learn. Today, the first day, was a loosely organized survey of the literature and of ways to study machine learning.

Goal setting

I set the following goals for the month.

Be able to propose solutions to new problems using a machine learning approach.
Be able to propose ways to replace work that people currently do by hand with automated processing.
Put those proposals into practice, or create a project that can be put into practice.
Solve simple problems that actually exist and present the solutions as the results of this free research.

Advance preparation

First, to make good use of this short month, I bought some books as a warm-up before the free research period began.

Books I bought (not yet read)

Data mining for business
A good book that a colleague brings up at every in-house lightning talk (LT).
I want to alternate between theory and practice, so I plan to read this for the practice side.
(I like theory, but I get tired of theory alone)
First pattern recognition
This one is also for practical examples of machine learning.
Introduction to Statistical Modeling for Data Analysis
Known as the "Midoribon" (the green book).
Recommended for building a foundation in statistics.
Natural language processing (The Open University of Japan teaching materials)
Some of the tasks I have in mind clearly require knowledge of natural language processing.
Deep Learning (Machine Learning Professional Series)
If I'm going to do this at all, I want to get as far as deep learning.
I personally like Go and poker, so I'd be happy if I could apply it there.
Bought purely out of interest.

Books I bought and read

Introduction to Strategic Data Science
What is data science?
What can and cannot be done
Usefulness of data science
Reading it first gets you excited before you start studying!
(If reading this doesn't get you excited, the field may not be for you.)
It is worth reading as a way to judge that.
Collective Intelligence Programming
I read it 8 years ago, so I would like to review it.
Back then I even took up Python just to read it.
This time I only went back over the table of contents.
Introduction to Machine Learning Theory for IT Engineers
As the title says: I learned machine learning theory broadly but shallowly.
My reading notes have become a list of pointers for future study.
It introduces many tools, so it is easy to put into practice.
The reading notes are summarized in the appendix at the end of this article.

Finally, the free research begins

Today, the first day, I browsed various sites and slides more or less at random to decide what course of study to follow.

Sites I looked at

Stanford University Machine Learning Online Course

Machine Learning - Stanford University | Coursera

Finished WEEK 1

Machine learning is a field in high demand!
Machine learning can be broadly divided into "supervised learning" and "unsupervised learning."
(I'm interested in supervised learning for the time being)
Supervised learning means that the correct answers are given along with the training data.
The list of features can be infinitely long, but there are algorithms that can handle it.
Regression problem => Prediction of continuous value output
Classification problem => Prediction of discrete value output
The course uses Octave as the language for learning machine learning!
Based on past results, machine learning is learned fastest with Octave,
faster than with Python.
And I had been preparing all sorts of things for Python ...
Impressions after WEEK 1
About 3 hours per WEEK.
Following the official schedule, it would take about three months
If I push, I can do one WEEK in the morning and another WEEK in the afternoon
The content is similar to "Introduction to Machine Learning Theory for IT Engineers"
However, it is easier to understand because it explains things more slowly and in more detail.
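
To make the regression / classification distinction from the WEEK 1 notes concrete, here is a minimal sketch in plain Python. The toy data, the helper names (fit_line, predict_value, predict_label), and the fixed threshold are all made up for illustration and are not taken from the course.

```python
# Sketch only: toy data and helper names are assumptions for illustration.

def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept for a single feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict_value(model, x):
    """Regression: the output is a continuous number."""
    slope, intercept = model
    return slope * x + intercept


def predict_label(threshold, x):
    """Classification: the output is one of a fixed set of labels (0 or 1)."""
    return 1 if x >= threshold else 0


# Supervised learning: every training input comes with its correct answer.
sizes = [0.0, 1.0, 2.0, 3.0]
prices = [1.0, 3.0, 5.0, 7.0]        # continuous targets (regression)
model = fit_line(sizes, prices)
print(predict_value(model, 4.0))     # a continuous prediction

print(predict_label(1.5, 2.0))       # a discrete prediction: 0 or 1
```

The two predictors differ only in the type of value they return, which is exactly the distinction the notes draw.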

Starting tomorrow, I will work through this Stanford machine learning course. It runs through WEEK 11, so at one WEEK in the morning and one in the afternoon it should take about a week. I would like to finish it before Christmas and get a feel for the field.

Also, once I start hands-on work, I don't want to lose focus by stumbling or wasting time on non-essentials, so I had planned to review my Python machine learning environment tomorrow and build the best setup I can. Now, however, I also have to set up an Octave environment.

What to do tomorrow

Machine Learning - Stanford University | Coursera
WEEK 2 and WEEK 3
(↑ This alone takes 6 hours ...)
Set up an Octave environment
Review (or rebuild) the Python machine learning environment
pyenv / virtualenv
Anaconda
IPython
Jupyter
NumPy
pandas
SciPy
matplotlib
PIL (Python Imaging Library)
scikit-learn
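
When reviewing the environment, a quick check of which of the libraries above can be imported may save some time. The following is only a sketch using the standard library. The mapping from display names to import names (for example sklearn for scikit-learn and jupyter_core for Jupyter) reflects the usual module names but should be treated as an assumption, and check_packages is a hypothetical helper name.

```python
# Sketch only: the import names below are the usual ones for each package,
# but treat the mapping as an assumption and adjust it to your own setup.
import importlib.util

# Display name -> top-level module name used in "import ...".
PACKAGES = {
    "IPython": "IPython",
    "Jupyter": "jupyter_core",
    "NumPy": "numpy",
    "pandas": "pandas",
    "SciPy": "scipy",
    "matplotlib": "matplotlib",
    "PIL": "PIL",
    "scikit-learn": "sklearn",
}


def check_packages(packages):
    """Return {display name: True/False} depending on whether it is importable."""
    return {
        name: importlib.util.find_spec(module) is not None
        for name, module in packages.items()
    }


if __name__ == "__main__":
    for name, found in check_packages(PACKAGES).items():
        print(f"{name:12s} {'ok' if found else 'MISSING'}")
```

pyenv, virtualenv, and Anaconda manage the interpreter itself rather than importable modules, so they are left out of this check.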

Appendix

Introduction to Machine Learning Theory for IT Engineers: Reading Notes

Overview

Introduction to Machine Learning Theory for IT Engineers, Etsuji Nakai, Technical Review Company (Gijutsu-Hyohron)

Classification of machine learning algorithms

Classification: algorithms that decide which class an input belongs to
Regression analysis: algorithms that predict numerical values
Clustering: algorithms that group data without supervision
Other (not covered)
Similarity matching
Co-occurrence analysis
Link prediction
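
As an illustration of the clustering category, here is a minimal k-means sketch on one-dimensional numbers. It is not code from the book. The data, the hand-picked initial centers, and the name kmeans_1d are assumptions chosen to keep the example short and deterministic.

```python
# Sketch only: 1-D toy data and hand-picked initial centers, for illustration.

def kmeans_1d(points, centers, max_iters=100):
    """Plain k-means on numbers: assign to nearest center, then recompute means."""
    centers = list(centers)
    clusters = [[] for _ in centers]
    for _ in range(max_iters):
        # Assignment step: group each point with its nearest center.
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: abs(p - centers[i]))
            clusters[nearest].append(p)
        # Update step: move each center to the mean of its group.
        new_centers = [
            sum(c) / len(c) if c else centers[i]   # keep an empty cluster's center
            for i, c in enumerate(clusters)
        ]
        if new_centers == centers:                 # nothing moved: converged
            break
        centers = new_centers
    return centers, clusters


# No labels are given; the grouping comes from the data alone.
centers, clusters = kmeans_1d([1, 2, 3, 10, 11, 12], centers=[1, 12])
print(centers)
print(clusters)
```

Unlike classification and regression, no correct answers are supplied here; the algorithm only sees the points themselves.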

Terms to review

Least squares
Training set, feature variables, objective variables
Polynomial approximation, error function
Overfitting problem
Maximum likelihood estimation
Defining the probability that the data is generated
Parameter estimation (the parameters that maximize that probability)
Evaluating estimators (consistency and unbiasedness)
Perceptron
The equation of the straight line that divides the plane
Evaluation of classification results by error function
Stochastic gradient descent -> updating the parameters using the gradient vector
Geometric interpretation
Bias term arbitrariness and algorithm convergence speed
Geometric interpretation / Geometric meaning of bias term
Logistic regression
Defining the probability that the data occurs
Determining parameters by maximum likelihood estimation
ROC curve
Application of logistic regression to real problems
Performance evaluation by ROC curve
IRLS method
k-means clustering
Basics of unsupervised learning model
EM algorithm
Unsupervised learning model by maximum likelihood estimation
Bernoulli distribution
Mixture of Bernoulli distributions
Clustering by EM algorithm
Bayesian inference
Bayes' theorem
Application of Bayesian inference to regression analysis
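
Of the terms above, the perceptron is the easiest to try by hand. The following is a rough sketch, not the book's implementation: it assumes labels of +1 / -1, a bias term fixed to a constant input of 1, toy linearly separable data, and hypothetical function names (train_perceptron, classify).

```python
# Sketch only: toy, linearly separable data; not the book's code.

def train_perceptron(samples, max_epochs=100):
    """samples: list of ((x, y), t) with label t in {+1, -1}."""
    w = [0.0, 0.0]   # normal vector of the separating line
    b = 0.0          # bias term
    for _ in range(max_epochs):
        errors = 0
        for (x, y), t in samples:
            score = w[0] * x + w[1] * y + b
            if t * score <= 0:      # misclassified (or exactly on the line)
                # Stochastic gradient step on the perceptron error function:
                # nudge the parameters toward classifying this sample correctly.
                w[0] += t * x
                w[1] += t * y
                b += t
                errors += 1
        if errors == 0:             # every sample is on the correct side
            break
    return w, b


def classify(w, b, point):
    """Return +1 or -1 depending on the side of the line w.x + b = 0."""
    x, y = point
    return 1 if w[0] * x + w[1] * y + b > 0 else -1


samples = [
    ((2.0, 2.0), 1), ((3.0, 1.0), 1), ((2.5, 3.0), 1),
    ((-1.0, -1.0), -1), ((-2.0, -0.5), -1), ((-1.5, -2.0), -1),
]
w, b = train_perceptron(samples)
print(w, b)
print([classify(w, b, p) for p, _ in samples])
```

The line w[0] * x + w[1] * y + b = 0 is the straight line that divides the plane, and each update uses one sample at a time, which is where the "stochastic" in stochastic gradient descent comes from.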

Further reference books

Introduction to Strategic Data Science -> the O'Reilly book
Pattern Recognition and Machine Learning -> the next book to read
Introduction to Statistical Modeling for Data Analysis, Takuya Kubo, Iwanami Shoten -> bought
Introduction to Data Analysis with Python -> available at the library

[PYTHON] Bringing machine learning to a practical level in one month # 1 (Starting edition)