[Python] How an uncle SE with a rusty brain studied machine learning

[Introduction]

This article is day 19 of the "How did you learn machine learning?" Nikkei xTECH Business AI ② Advent Calendar 2019.

I will write about how I, an uncle SE, studied machine learning. What set it off was a classification task I had at the time and the thought, "Couldn't machine learning handle this?" Honestly, my memory of it is hazy, because back then I was completely in the dark.

Career

I have been programming for nearly 30 years. Starting with MS BASIC, which I learned in elementary school, I have touched Z80 assembler, MC68000 assembler, FORTRAN, C (UNIX), C++ (Mac), VB, Java (Android), VB.NET, and C#. I have come into contact with many languages, but not all of them in any great depth.

As for machine learning, more than 20 years ago, at the tail end of the so-called second AI boom, I worked with neural networks for my thesis. However, someone else handled the theory and I was only in charge of the implementation, so I never really understood what it was doing. Looking back, I should have studied harder at the time, but it's too late now. Incidentally, there were no machine learning libraries back then, so I implemented it in C.

[Studying Python - Winter 2017]

Before studying machine learning, I had been teaching myself Python. I had decided to monitor my home with a Raspberry Pi, and used Python to implement it. That said, my understanding went no further than copying sample code and tweaking it. I referred to the following book at the time.

[Color Illustrated: Electronic Work Learned with the Latest Raspberry Pi] - Amazon Japan

The Raspberry Pi comes ready to use Python by default, so it was a good starting point for me, who didn't know the first thing about Python. However, buying a Raspberry Pi just to learn Python isn't realistic, so today I would recommend an online learning site instead. Among people around me, the free tiers of "Paiza Learning" and "Progate" were popular.

["What is machine learning?" First, I bought a book - Spring to Autumn 2017]

It all started when I felt I had hit the limits of the rule-based implementation of the classification problem I had at the time. I had a vague hope that "AI" could somehow solve it, and while groping for a way in, I learned that machine learning could be implemented with the Python library scikit-learn. So I bought the following book, newly released at the time.

[Machine Learning Starting with Python - Feature Engineering and Machine Learning Basics with scikit-learn] - Amazon Japan

I bought it simply because the cover had the words Python, scikit-learn, and machine learning on it, and because it was an O'Reilly book, a publisher I had long relied on. To be honest, most of it went over my head; I could only get through parts of Chapters 1 and 2. What I picked up in the process is as follows.

[Chapter 1: Introduction]
- Python grammar
- The existence of various libraries (scikit-learn, NumPy, pandas, matplotlib)
- The Iris dataset

[Chapter 2: Supervised Learning]
- k-nearest neighbors
- SVM
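For context, the two Chapter 2 models can be tried on the Iris dataset in just a few lines. This is a minimal sketch assuming scikit-learn is installed, not the book's own code:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

# Prepare the data: 150 iris samples, 4 features, 3 species
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Fit and score the two Chapter 2 models
scores = {}
for model in (KNeighborsClassifier(n_neighbors=3), SVC()):
    model.fit(X_train, y_train)
    scores[type(model).__name__] = model.score(X_test, y_test)

print(scores)
```

Both classifiers reach well over 90% accuracy on this easy dataset, which is exactly why it makes a good first exercise.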

Output #1 - Application to business

I think I understood less than 10% of the book's content at this point. Even so, I went ahead and implemented something to see whether it could help with the problem I had at the time: a multi-class classification of data with many parameters. It was then implemented on a rule basis, but accuracy was poor. I visualized the data and studied it from every angle, but it was so noisy that I could find almost no regularity, and I felt that further accuracy improvements on a rule basis were impossible.

[Figure: sample of the data being analyzed at the time]

The figure above shows the data being analyzed at the time. It was a multi-class classification problem with dozens of such series. Today there is no doubt we would tackle it with machine learning, but back then no one in the company, myself included, had that idea at all; we were desperately trying to implement it on a rule basis.

When I fed this data into SVM and k-nearest neighbors, it was classified after a fashion. In particular, when I brute-forced the hyperparameters of the k-nearest neighbors model, I got respectable accuracy, if slightly below the rule-based approach. Where the rule-based approach had taken a long struggle, a model with reasonable accuracy came together just a few minutes after feeding in the data, and I sensed real potential there.
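Brute-forcing k-NN hyperparameters like this can be done systematically with scikit-learn's GridSearchCV. A small sketch, using the Iris data as a stand-in since the business data can't be shown:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Iris stands in here for the (confidential) business data
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Exhaustively try every combination with 5-fold cross-validation
param_grid = {
    "n_neighbors": list(range(1, 16)),
    "weights": ["uniform", "distance"],
}
search = GridSearchCV(KNeighborsClassifier(), param_grid, cv=5)
search.fit(X_train, y_train)

print(search.best_params_)           # best combination found
print(search.score(X_test, y_test))  # accuracy on held-out data
```

GridSearchCV refits the best model on the full training set, so `search` can be used directly for prediction afterwards.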

However, at the time I didn't know how to push the accuracy further, and I judged that I couldn't spend any more time on it, so despite the promising results, the adoption of machine learning was shelved.

What I learned

- I could now read and write Python code
- Data visualization with pandas and matplotlib
- The basic flow of "prepare data > define model > fit (train) > predict (infer)" (the flow stays the same even when the model changes)
- On a platform you are touching for the first time, start by classifying irises
- Don't fear the outcome; just try it
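The pandas/matplotlib visualization mentioned above amounts to only a few lines. A minimal sketch on the Iris data (the Agg backend is set so it runs without a display):

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen, no display required
import matplotlib.pyplot as plt
import pandas as pd
from sklearn.datasets import load_iris

# Load iris as a DataFrame: 4 feature columns plus a "target" label column
df = load_iris(as_frame=True).frame
print(df.describe())  # quick numeric bird's-eye view

# Pairwise scatter plots colored by class: a fast check for structure vs. noise
axes = pd.plotting.scatter_matrix(df.iloc[:, :4], c=df["target"], figsize=(8, 8))
plt.savefig("iris_scatter_matrix.png")
```

If clear clusters show up in the scatter matrix, a simple classifier has a good chance; if everything overlaps, more feature work is needed first.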

Incidentally, I was hospitalized for a long stretch right after this trial, and I reread the O'Reilly book in my hospital room. I got further than the first time, but since I couldn't bring my PC into the room, I couldn't write any code, and little of it stuck. It drove home, once again, that nothing sinks in unless I move my hands.

[Challenging deep learning - Spring to Summer 2018]

I had been away from machine learning for about half a year after the attempt above, but since it had shown promise, I decided to try deep learning next. I started with a TensorFlow tutorial.

【TensorFlow 2 quickstart for beginners】 - TensorFlow

The tutorial was an MNIST (handwritten digit) classification problem, but at the time I couldn't follow code that called the TensorFlow API directly. Even so, I managed to implement an iris classifier with it, but was frustrated because I couldn't get any accuracy out of it.

After that, I learned of Keras's existence and, working from sample programs, learned how to classify images with a CNN built on the Sequential model.

【keras-team/keras/examples】 - GitHub
【keras-team/keras/examples/mnist_cnn】 - GitHub

The TensorFlow tutorials now use Keras (tf.keras), so I think it is considerably easier these days.
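For reference, a Keras Sequential CNN in the spirit of the mnist_cnn example looks roughly like this. It is a sketch of the model definition only; the layer sizes are illustrative, not the exact example code:

```python
from tensorflow import keras
from tensorflow.keras import layers

# A small Sequential CNN for 28x28 grayscale MNIST digits
model = keras.Sequential([
    keras.Input(shape=(28, 28, 1)),
    layers.Conv2D(32, kernel_size=3, activation="relu"),
    layers.Conv2D(64, kernel_size=3, activation="relu"),
    layers.MaxPooling2D(pool_size=2),
    layers.Flatten(),
    layers.Dropout(0.5),
    layers.Dense(10, activation="softmax"),  # one output per digit class
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
```

Training is then just `model.fit(x_train, y_train, ...)` followed by `model.evaluate(x_test, y_test)`, which is exactly the prepare > define > fit > predict flow from the scikit-learn days.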

In addition, the notes below [^1] summarize the know-how from getting machine learning and TensorFlow working on Raspberry Pi and Windows machines at the time.

Getting to know Google Colaboratory

The biggest thing I gained from studying deep learning was discovering Google Colaboratory (hereafter, Google Colab). With Google Colab, TensorFlow and Keras are available the moment you launch it, and you can train at high speed on a free GPU, which cut my time budget dramatically.

【Google Colaboratory】

This was a real breakthrough; I don't think my study of machine learning would have progressed without **Google Colab**. The TensorFlow tutorials now link straight to Google Colab, and it is a very useful tool. It's free once you have a Google account, so if environment setup or slow processing speed is holding you back, do give it a try.

Output #2 - Prototype image classification app

During this period, I kept trying to improve MNIST accuracy on Google Colab. In hindsight, Kaggle hosts an MNIST competition (Digit Recognizer), so I wish I had joined Kaggle sooner.

Using the techniques picked up in this challenge, I implemented an image classification model that could be used in our business, wrapped it in a Web API (Flask) in Python, and built a simple application that returns a result when you throw an image at it. The accuracy was around 97.5%. The figure below is from a PoC done before building the prototype application.

[Figure: PoC of the image classification model]

When I presented it internally, it was rejected on the grounds that "it gets 25 out of 1000 cases wrong." AI literacy at the time, my own included, was low, and I couldn't muster a counterargument...
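For illustration, the skeleton of such a Flask Web API might look like the sketch below. The `classify` function here is a placeholder standing in for the actual trained model, which isn't shown:

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

def classify(image_bytes):
    # Placeholder: a real app would decode the image and call model.predict
    return {"label": "placeholder", "n_bytes": len(image_bytes)}

@app.route("/predict", methods=["POST"])
def predict():
    # Accept an uploaded image file and return the classification as JSON
    image_bytes = request.files["image"].read()
    return jsonify(classify(image_bytes))

# Run with: flask --app <this file> run
```

A client then just POSTs an image to `/predict` and gets the label back as JSON, which is all the prototype application needed to do.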

Around the same time, I also took on a project implemented in Python. It wasn't directly related to machine learning, but because it handled large volumes of data from a database and CSV files, I learned a great deal about using pandas.
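A typical slice of that kind of pandas work, parsing a CSV and aggregating it, might look like this (the data here is made up for illustration):

```python
import io

import pandas as pd

# Made-up CSV standing in for the project's large data exports
csv_data = io.StringIO(
    "date,category,amount\n"
    "2018-06-01,A,100\n"
    "2018-06-01,B,250\n"
    "2018-06-02,A,300\n"
)

# Parse the CSV, then aggregate per category
df = pd.read_csv(csv_data, parse_dates=["date"])
totals = df.groupby("category")["amount"].sum()
print(totals)
```

The same `read_csv` / `groupby` pattern scales from toy data like this to the millions of rows the project actually handled.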

What I learned

- How to use GitHub
- Implementing a CNN with Keras

[Challenging the G test - Winter 2019]

Output #3 - Holding in-house study sessions

Once I had a working grasp of what a CNN was, I took on the role of lecturer and held in-house study sessions on machine learning and deep learning. This was another big turning point. As knowledge of machine learning spread within the company, a movement emerged to promote machine learning across the entire section, and that led to the actions that followed.

Output #4 - Obtaining the G certification

I decided to take on the G test together with members of the study sessions above. The books we used are as follows.

[Deep Learning Textbook: Deep Learning G Test (Generalist) Official Text]
[AI White Paper 2019]

At the same time, several employees were asked to obtain the G certification, which raised AI literacy within the company. Thanks to that, I had far more opportunities to talk about machine learning at work, something I previously had no one to discuss with, and it relieved a great deal of my stress.

To be honest, I don't think the G test itself is directly useful in practice, but it was very valuable that employees could now converse in shared terminology. In the end, I couldn't do it alone; I felt I needed to recruit allies and push forward together.

What I learned

- Recognizing the importance of output
- Terminology of machine learning and AI
- Improved in-house AI literacy
- Colleagues with the same goal (in-house)

[Participation in hands-on seminar]

After passing the G test, I attended various free hands-on seminars hosted by Google and Microsoft. I attended quite a few, but at that stage they didn't have much impact on me; the beginner-level seminars may simply have left me wanting more.

[Participation in a long-term seminar - Spring to Autumn 2019]

Meanwhile, I got the chance to participate in a long-term seminar running for half a year. The following videos were provided as pre-study material.

[【Kikagaku style】Artificial Intelligence / Machine Learning De-black-box Course - Beginner] - Udemy
[【Kikagaku style】Artificial Intelligence / Machine Learning De-black-box Course - Intermediate] - Udemy

These videos let me dig properly into the machine learning I had until then been using with only a hazy, secondhand understanding.

I'll skip the details of the long-term seminar itself, since it would come across as blowing my own horn, but beyond the content, struggling through half a year together with the other students was a hard but valuable experience.


Output #5 - Mentoring students

While attending the seminar above as a student, I also provided domain knowledge and technical support to the instructors. The second half of the seminar was mainly practical work, so I took part as a mentor rather than a student. I'm not sure how well I did on this first attempt, but it was a great learning opportunity for me as well.

What I learned

- The mathematical underpinnings of machine learning and deep learning
- Familiarity with data preprocessing
- Hyperparameter tuning
- Generative models

[From now on]

Output #6 - Building communities such as a mokumoku-kai

I currently host a machine learning "mokumoku-kai" (a casual co-working study meetup). It started as an internal event, but it has grown into a place where seminar participants and graduates can gather, and we support people who are just starting out with machine learning. I am also involved in "DEEP LEARNING LAB" and intend to stay active there as well.

Even in a small region, I want to build communities where people connect through machine learning and AI, and reduce the number of people who struggle alone and give up.


[Summary]

Above all, move your hands

I have written a lot, but there is one thing I can say for certain: **you grow the most when you move your hands**. Even if you read a book or a web article and think you understand, you will stumble the moment you actually try to run something. Only by running what you have read about do you understand it deeply. If possible, it is best to think things through and implement them yourself rather than copy-pasting code. I stumbled at the implementation stage most of the time, and I believe that facing those obstacles during implementation and overcoming them is where the real learning happens.

Choosing datasets to work on

Frankly, working only with ready-made datasets such as Iris and MNIST offers limited skill growth. Trying famous datasets such as Titanic as tutorials is fine, but honestly they didn't motivate me, so I couldn't engage with them seriously. When I wrangle data connected to something I actually want to do, such as a real business problem, I run into all kinds of problems, and working through them is how skills actually stick.

Even without a direct connection to your business, it is a good idea to **find a dataset that interests you and work on it**, for example on Kaggle. Whether the draw is the dataset itself or the prize money doesn't matter, but if the task doesn't motivate you, you will give up quickly when you hit problems. On Kaggle, results map directly to a ranking, so progress is easy to see, and you can compare your approach against the code of top-ranked competitors, which I recommend.

Find someone to talk to

The hardest part of tackling the unknown territory of machine learning was having no one to talk to. Working alone is bad for you mentally, and you grind to a halt whenever you get stuck. Many of the people I met at seminars came to me for advice precisely because **they had no one to talk to at their company**. I try to guide them toward a solution while listening, but people vary: some solve the problem through discussion, some solve it themselves simply by talking it through, and some bring colleagues to the mokumoku-kai, grow their circle in-house, and solve it together. Either way, talking to others tends to move you closer to a solution. Talking is itself a form of output, and I think it is very important for organizing your own thoughts. Ideally, find someone you can talk to in person, whether inside your company or at events such as seminars and mokumoku-kai. Failing that, you can still connect online, for example on Kaggle's forums.

[In conclusion]

Japan is still immature when it comes to machine learning and AI. Most of the information is in English, and you will often hit a wall without finding what you are looking for. When that happens, don't agonize alone. Involve others, whether your boss or your colleagues. If there is no one around you, look further afield. There are plenty of people in the world with the same worries, and in today's internet society it is easy to find people in the same situation. I will keep studying so that I can be of help to you, so let's keep at it together.

[^1]: [How to run Python on Windows without polluting the environment as much as possible (Python embeddable version)] - Qiita
[How to run Python on Windows without polluting the environment as much as possible (WSL; Windows 10 version 1607 or later only)] - Qiita
[Using a GPU with TensorFlow for Windows (installing CUDA)] - Qiita
[[Building a Raspberry Pi for Python and machine learning study] - Qiita](https://qiita.com/rhene/items/71b92c253d5ac2a4cc52)
[Building a Raspberry Pi for Python and machine learning study (Raspberry Pi 4 & Buster version)] - Qiita
