In what order should you study machine learning? This question troubled me for a long time too. I started with very little background knowledge, but as I worked through various materials I began to see a direction, so I decided to summarize it here. Whether it will be helpful to everyone, I cannot say.

A common view is that you can drive a car without knowing how the engine works, and that is quite right. What matters is the value you create with the car, and nobody has time to reinvent the wheel. With scikit-learn for machine learning and TensorFlow for deep learning, these days you can get AI up and running in no time!

That's true, but I think it is better to know the underlying theory so that you can choose tools more appropriately and use them more effectively. Being able to understand intuitively that a given problem should be solved with a given approach is very important.

In the end, I settled on the scikit-learn cheat sheet and the Microsoft cheat sheet (….com/ja-jp/azure/machine-learning/algorithm-cheat-sheet). I felt that both organize the classification of machine learning methods well. The scikit-learn sheet is a little narrower in scope, and I think the Microsoft sheet is better for covering a wider range of fields, including newer ones.

Going forward, for each of the items below I plan to start from the theory, work through a Python implementation that does not rely on libraries, and then move on to using the libraries. There are already many similar and better articles, so I expect I will often defer to them (consider this a declaration that I will cut corners).

I will skip the high-school-level mathematics needed for understanding (differential calculus, matrices, probability and statistics, etc.) and the basics of using Python.

Machine learning falls into three main categories: supervised learning, unsupervised learning, and reinforcement learning.

Supervised learning learns the outputs that correspond to various inputs and then estimates the outputs for unknown inputs. Estimating the price of a house, estimating the quality of wine, and recognizing handwritten characters are all done with supervised learning.
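As a rough illustration of "learn from known input/output pairs, then estimate the output for an unknown input", here is a minimal sketch of a k-nearest-neighbor classifier in plain Python. The data, the labels "A" and "B", and the function name `knn_predict` are all made up for this example.

```python
from collections import Counter
import math

def knn_predict(train, query, k=3):
    """Predict the label of `query` by majority vote among its k nearest training points."""
    # Sort the labeled examples by Euclidean distance to the query.
    by_distance = sorted(train, key=lambda item: math.dist(item[0], query))
    # Take the labels of the k closest examples and return the most common one.
    top_labels = [label for _, label in by_distance[:k]]
    return Counter(top_labels).most_common(1)[0][0]

# Made-up toy data: (features, label) pairs.
train = [
    ((1.0, 1.0), "A"), ((1.5, 2.0), "A"), ((2.0, 1.5), "A"),
    ((6.0, 6.5), "B"), ((7.0, 6.0), "B"), ((6.5, 7.0), "B"),
]

pred_near_a = knn_predict(train, (2.0, 2.0))  # an input that was not in the training data
pred_near_b = knn_predict(train, (6.0, 6.0))
print(pred_near_a, pred_near_b)
```

There is no real "training" step here: the model simply remembers the labeled examples, which makes it a convenient first picture of supervised learning.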

Unsupervised learning is used to organize high-dimensional data, project it onto a lower-dimensional space (dimension reduction), and group the data into categories. Grouping irises by their measurements, without using species labels, is an example of unsupervised learning.

Reinforcement learning learns which actions to take in order to maximize rewards. Programs that learn to beat video games, and the so-called AIs for Go and Shogi, are based on reinforcement learning.
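To make "learning the actions that maximize rewards" a little more concrete, here is a small sketch of tabular Q-learning. The environment is a hypothetical toy invented for this example: five states on a line, with a reward of 1 for reaching the rightmost one. The parameter values are arbitrary assumptions.

```python
import random

N_STATES = 5          # states 0..4 on a line; state 4 is the goal
ACTIONS = (0, 1)      # 0 = move left, 1 = move right

def step(state, action):
    """Hypothetical toy environment: returns (next_state, reward, done)."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done

def q_learning(episodes=500, alpha=0.5, gamma=0.9, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]   # q[state][action]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            action = rng.choice(ACTIONS)        # explore uniformly at random
            next_state, reward, done = step(state, action)
            # A terminal state has no future value to add.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q

q = q_learning()
# Greedy policy: in each non-terminal state, pick the action with the larger Q value.
policy = [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]
print(policy)
```

The agent is never told that "right" is correct. It only sees rewards, and the learned Q values end up preferring the action that leads toward the goal.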

Because scikit-learn is a library dedicated to machine learning, it covers the basics well. For a detailed explanation I will defer to "[What is Scikit-learn? Summary of what you can do with Scikit-learn in 5 minutes](https://ai-kenkyujo.com/2019/07/08/can-do-with-scikit-learn/)"; here, let's first consider the basic algorithms.

There are many kinds of regression analysis, such as:

- Linear regression
- Simple regression
- Multiple regression
- Basis set regression
- Gradient descent method
- Regularization: Lasso regression / Ridge regression / ElasticNet regression

and so on.
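Several of the items above fit together in one small example. The following is a sketch, not a reference implementation: it fits a simple regression line by gradient descent on the mean squared error, with an optional ridge-style penalty on the slope. The function name `fit_line`, the learning rate, the epoch count, and the data are all assumptions chosen for illustration.

```python
def fit_line(xs, ys, lr=0.05, epochs=2000, l2=0.0):
    """Fit y ~ w*x + b by gradient descent on the mean squared error.

    `l2` adds a ridge penalty l2 * w**2 (the intercept b is not penalized).
    """
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        errors = [w * x + b - y for x, y in zip(xs, ys)]
        # Gradients of (1/n) * sum(errors**2) + l2 * w**2 with respect to w and b.
        grad_w = 2.0 / n * sum(e * x for e, x in zip(errors, xs)) + 2.0 * l2 * w
        grad_b = 2.0 / n * sum(errors)
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]   # made-up data lying exactly on y = 2x + 1

w, b = fit_line(xs, ys)                      # plain simple regression
w_ridge, b_ridge = fit_line(xs, ys, l2=1.0)  # same data with a ridge penalty
print(w, b, w_ridge)
```

Without the penalty the fit recovers the slope and intercept of the line the data was drawn from. With the penalty the slope is pulled toward zero, which is the basic effect regularization is meant to have.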

Classification covers tasks such as distinguishing between dogs, cats, and birds, or recognizing characters.

- Two-class classification
- Simple perceptron
- Logistic regression
- Support vector machine
- Basic
- Application
- Multi-class classification
- 2 Multi-class classification
- Theory
- Implementation
- k-nearest neighbor method (kNN method)
- Decision tree
- Bagging
- Boosting
- Random forest

and so on.
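As a starting point for two-class classification, here is a minimal sketch of a simple perceptron in plain Python. It assumes two input features, labels encoded as +1 and -1, and linearly separable data. The sample points and function names are made up for this example.

```python
def train_perceptron(samples, max_epochs=100):
    """Train a simple perceptron on ((x1, x2), label) pairs with labels +1 / -1."""
    w = [0.0, 0.0]
    b = 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for (x1, x2), y in samples:
            score = w[0] * x1 + w[1] * x2 + b
            if y * score <= 0:          # misclassified (or exactly on the boundary)
                # Nudge the boundary toward classifying this sample correctly.
                w[0] += y * x1
                w[1] += y * x2
                b += y
                mistakes += 1
        if mistakes == 0:               # every sample is classified correctly
            break
    return w, b

def predict(w, b, x):
    """Return +1 or -1 depending on which side of the boundary x falls."""
    return 1 if w[0] * x[0] + w[1] * x[1] + b > 0 else -1

# Made-up, linearly separable toy data.
samples = [((2.0, 1.0), 1), ((3.0, 3.0), 1), ((-1.0, -1.0), -1), ((-2.0, 0.0), -1)]
w, b = train_perceptron(samples)
print(w, b, predict(w, b, (1.0, 1.0)))
```

If the data is not linearly separable this loop simply stops after `max_epochs` without a perfect boundary, which is one motivation for moving on to logistic regression and support vector machines.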

For clustering:

- k-means method

and so on.
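The k-means method alternates between two steps: assign each point to its nearest centroid, then move each centroid to the mean of its points. The sketch below shows this for 2-D points in plain Python. Using the first `k` points as initial centroids is a simplifying assumption made here to keep the example deterministic. Practical implementations usually initialize more carefully.

```python
def kmeans(points, k, iters=10):
    """Plain k-means on 2-D points; returns (centroids, labels)."""
    centroids = list(points[:k])   # naive initialization: the first k points
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: attach each point to its nearest centroid.
        for i, (x, y) in enumerate(points):
            labels[i] = min(
                range(k),
                key=lambda c: (x - centroids[c][0]) ** 2 + (y - centroids[c][1]) ** 2,
            )
        # Update step: move each centroid to the mean of its assigned points.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[c] = (
                    sum(p[0] for p in members) / len(members),
                    sum(p[1] for p in members) / len(members),
                )
    return centroids, labels

# Made-up toy data with two obvious groups.
points = [(1.0, 1.0), (9.0, 9.0), (1.0, 2.0), (2.0, 1.0), (8.0, 9.0), (9.0, 8.0)]
centroids, labels = kmeans(points, k=2)
print(centroids, labels)
```

No labels are given to the algorithm. The group numbers it returns come only from how the points are laid out.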

For dimension reduction:

- PCA (Principal Component Analysis)
- Kernel principal component analysis
- Matrix factorization
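To give a feel for what PCA computes, here is a small sketch that finds only the first principal component, using power iteration on the covariance matrix. Power iteration is a choice made for this example because it needs no libraries. It is not necessarily how any particular library implements PCA, and the data and function name are made up.

```python
import math

def first_principal_component(data, iters=100):
    """Return a unit vector along the first principal component of `data`."""
    n, d = len(data), len(data[0])
    # Center the data so that each feature has mean zero.
    means = [sum(row[j] for row in data) / n for j in range(d)]
    centered = [[row[j] - means[j] for j in range(d)] for row in data]
    # Covariance matrix (d x d).
    cov = [[sum(r[i] * r[j] for r in centered) / n for j in range(d)] for i in range(d)]
    # Power iteration: repeatedly multiply by cov and renormalize.
    # Assumes the start vector is not orthogonal to the leading eigenvector.
    v = [1.0] + [0.0] * (d - 1)
    for _ in range(iters):
        v = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in v))
        v = [x / norm for x in v]
    return v

# Made-up 2-D data that varies mostly along the diagonal direction.
data = [(-2.0, -2.0), (-1.0, -1.0), (1.0, 1.0), (2.0, 2.0), (-0.5, 0.5), (0.5, -0.5)]
pc1 = first_principal_component(data)
print(pc1)
```

Projecting each centered point onto `pc1` would reduce this 2-D data to one dimension while keeping most of its variance.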

Microsoft is very focused on machine learning and has published a lot of papers.

Morphological analysis, statistical analysis vectorization, etc.

Eventually I will study neural networks and deep learning and move on to Kaggle, but I would like to start with the classics.
