[PYTHON] Judgment of igneous rock by machine learning ②

Development policy

① Programming with machine learning

I will build it in Python, a programming language that is often used for machine learning. It is a little more difficult, but the rocks will be classified by machine learning. I will download about 10 training images for each type of rock.

② When completed, make it an app.

If possible, I would like to run it on a smartphone such as an Android device.

③ Also try a program that does not use machine learning

Even without machine learning, I will create a program that identifies the rock type from its color, and try to separate the images based on their color characteristics.

Development process

① Program development to judge by machine learning

Development days 5 days

(1) Image collection

I collected 10 photos for each rock type with Google image search, then adjusted their brightness and size and organized them. I tried to download the images using the Google API, but the usage restrictions made this quite difficult. In the end, I decided to download them manually.
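The brightness and size adjustment described above can be sketched in plain Python. This is only a minimal illustration, not the processing actually used for these photos: the function names, the target brightness of 128, and the sample values are all assumptions, and the image is represented as a flat list of grayscale values.

```python
def center_square_box(width, height):
    """Return (left, top, right, bottom) of the largest centered square."""
    side = min(width, height)
    left = (width - side) // 2
    top = (height - side) // 2
    return (left, top, left + side, top + side)


def normalize_brightness(pixels, target_mean=128.0):
    """Scale grayscale values (0-255) so their mean is close to target_mean."""
    mean = sum(pixels) / len(pixels)
    if mean == 0:
        return list(pixels)  # an all-black image cannot be rescaled
    scale = target_mean / mean
    # Clip to the valid range after scaling.
    return [min(255, max(0, round(p * scale))) for p in pixels]


# Hypothetical 640x480 photo and a tiny, dark 4-pixel sample.
box = center_square_box(640, 480)
adjusted = normalize_brightness([10, 20, 30, 40])
print(box)       # (80, 0, 560, 480)
print(adjusted)  # [51, 102, 154, 205]
```

With an imaging library such as Pillow, a box like this could be passed to `Image.crop` and followed by `Image.resize` so that every training photo ends up with the same square size.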

(2) Import error

I tried to set up a machine learning environment, but it did not work. I wasted about 4 hours on the following error.

ImportError: DLL load failed: Dynamic link library (DLL) initialization routine execution failed.

A dynamic link library (DLL) is a program file shared by applications, and reading the message carefully, it says that initializing one failed. After investigating, I found that the prebuilt TensorFlow binaries use special CPU instructions called AVX from version 1.6 onward, so they cannot run on low-priced CPUs without AVX, such as the Intel Celeron, Intel Pentium, and Intel Atom. A website said that TensorFlow would run after downgrading to a version released before that change, so I tried downgrading. However, it turned out that Keras did not support the downgraded TensorFlow version. I had no choice but to test on another computer.
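Whether a CPU supports AVX can be checked before installing anything. The sketch below is an illustration under stated assumptions: it parses text in the format of Linux's `/proc/cpuinfo` (on Windows, where the error above occurred, a different tool would be needed), and the function names and sample excerpts are hypothetical.

```python
def cpu_flags(cpuinfo_text):
    """Collect the CPU feature flags listed in /proc/cpuinfo-style text."""
    flags = set()
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            flags.update(value.split())
    return flags


def supports_avx(cpuinfo_text):
    """True if the 'avx' feature flag is present."""
    return "avx" in cpu_flags(cpuinfo_text)


# Hypothetical excerpts, not taken from real machines.
WITH_AVX = "model name : Example CPU A\nflags : fpu sse sse2 avx avx2\n"
WITHOUT_AVX = "model name : Example CPU B\nflags : fpu sse sse2 ssse3\n"

print(supports_avx(WITH_AVX))     # True
print(supports_avx(WITHOUT_AVX))  # False
```

On a Linux machine, the same function could be called with the contents of `/proc/cpuinfo` read from disk.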

(3) Development with Keras

I started on the machine learning using Keras, a library that runs on top of TensorFlow. However, the program on the site I referred to was very buggy and took a long time to fix. I searched various sites to find out how to fix it. After training, I ran a verification experiment. However, the only answer it ever gave was basalt, so I downloaded the rock images again and cropped them.

(4) Rebuilding from scratch

While organizing my image folders, I accidentally overwrote files and lost my machine learning program. In the end, I had to write it again from the beginning. As before, I recreated the machine learning program, ran the verification experiments, and completed the system.
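A model that answers "basalt" for every image can be spotted quickly by counting how often each label is predicted. The following is a minimal sketch of such a check; the function names, the 90% threshold, and the sample predictions are assumptions made for illustration.

```python
from collections import Counter


def prediction_shares(predicted_labels):
    """Return the fraction of predictions that went to each label."""
    counts = Counter(predicted_labels)
    total = len(predicted_labels)
    return {label: n / total for label, n in counts.items()}


def is_collapsed(predicted_labels, threshold=0.9):
    """True if a single label receives at least `threshold` of all predictions."""
    if not predicted_labels:
        return False
    return max(prediction_shares(predicted_labels).values()) >= threshold


# Hypothetical verification results.
mostly_basalt = ["basalt"] * 19 + ["granite"]
balanced = ["basalt", "granite", "rhyolite", "andesite"] * 5

print(is_collapsed(mostly_basalt))  # True
print(is_collapsed(balanced))       # False
```

If the check returns True on a verification set that contains several rock types, the model is probably not using the image content at all, which points to a problem with the data or the training rather than with a single photo.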

(5) Asked an expert if it could be determined by machine learning.

Even after organizing the images, basalt was still the only answer, so I suspected that it might be difficult to raise the correct answer rate with machine learning. I therefore asked Mr. Kuniyasu Mokudai, an expert who had seen my materials, on Facebook.

Question: "Igneous rocks cannot be accurately identified by color or texture, and may instead be determined by silicon dioxide content or the place of collection, so I think it may be quite difficult with AI."

Answer: "As you would expect, igneous rocks can be distinguished to some extent with the naked eye, but they are basically defined by their chemical properties (silicon dioxide content), so for photo-based classification I think a correct answer rate of 30 to 50% is a reasonable level."

In other words, igneous rocks cannot be reliably distinguished by color or texture; the deciding criterion is the amount of silicon dioxide. The silicon dioxide content matters more to the classification of igneous rocks than color or texture does. I realized that a correct answer rate of 80% to 100% would be difficult, but I decided to raise it as much as possible.
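To make the role of silicon dioxide concrete, here is a small sketch of a composition-based classification. The boundaries of 45, 52, and 63 wt% follow a commonly used convention and are an assumption added here (some textbooks draw the lines slightly differently); they are not part of the expert's answer, and the sample values are only rough, typical figures.

```python
def classify_by_silica(sio2_wt_percent):
    """Group an igneous rock by its SiO2 content in weight percent."""
    if not 0 <= sio2_wt_percent <= 100:
        raise ValueError("SiO2 content must be between 0 and 100 wt%")
    if sio2_wt_percent < 45:
        return "ultramafic"
    if sio2_wt_percent < 52:
        return "mafic"
    if sio2_wt_percent < 63:
        return "intermediate"
    return "felsic"


# Rough, illustrative SiO2 values (wt%).
print(classify_by_silica(50))  # mafic (typical of basalt)
print(classify_by_silica(60))  # intermediate (typical of andesite)
print(classify_by_silica(72))  # felsic (typical of rhyolite and granite)
```

A photo contains none of this chemical information directly, which is consistent with the modest accuracy the expert expected from image-based classification.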

(6) Increase the percentage of correct answers

First, as a test to raise the correct answer rate, I selected two types of igneous rock (for example, rhyolite and granite). When I classified images of these, the correct answer rate was about 90%. Next, I increased the number to six types of igneous rock, but the correct answer rate dropped to 30% to 60%. I suspected that underfitting was occurring. Underfitting occurs when the model still has room for improvement on the training data. It has various causes: the model is not powerful enough, it is too strongly regularized, or the training time is simply too short. In each case, the model has not fully learned the relevant patterns in the training data. I therefore increased the number of training epochs to about 30, and the accuracy rose to about 90%. The number of epochs turned out to be important.
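One simple way to judge whether training stopped too early is to look at whether the loss was still falling in the last few epochs. The sketch below shows the idea in plain Python; the function name, the window of 3 epochs, the `min_delta` of 0.01, and both loss histories are hypothetical values chosen for illustration.

```python
def still_improving(losses, window=3, min_delta=0.01):
    """True if the mean loss of the last `window` epochs is lower than the
    mean of the preceding `window` epochs by more than `min_delta`."""
    if len(losses) < 2 * window:
        return True  # too few epochs to judge; treat as not yet converged
    recent = sum(losses[-window:]) / window
    previous = sum(losses[-2 * window:-window]) / window
    return previous - recent > min_delta


# Hypothetical per-epoch training losses.
short_run = [1.8, 1.5, 1.3, 1.1, 0.95, 0.8]
long_run = [1.8, 1.2, 0.8, 0.5, 0.31, 0.30, 0.30, 0.30, 0.30, 0.30]

print(still_improving(short_run))  # True: more epochs would likely help
print(still_improving(long_run))   # False: the loss has flattened out
```

A result of True suggests that the model is undertrained and that raising the number of epochs, as was done here, is worth trying.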

(7) Overfitting occurs

However, the correct answer rate at verification time did not increase much. The cause was "overfitting" (also called overtraining), which occurs when the number of images is small and the number of training epochs is large. The model had learned even fine details specific to individual images (for example, the shapes of shadows and patterns). When overfitting occurs, the model contains information specific to the training data, so it has difficulty classifying unseen data.

As a countermeasure, I decided to increase the number of images. Next, I tried regularization, which constrains the amount and type of information the model can store and suppresses overfitting by making the model learn only the important points. I also tried one more method, early stopping: when there are signs that overfitting is starting, the model is saved at that point and training ends.

It seems that overfitting can also occur when the images are very similar or have few distinguishing features, so it may have occurred here because the training data did not contain many features. In the end, I found that machine learning needs tuning to prevent both underfitting and overfitting.

Result: The trained model was completed.
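The early stopping rule described above can be written as a short loop over the validation loss. This is a sketch of the general technique, not the code used in this project; the function name, the patience of 3 epochs, and the loss values are assumptions. Keras provides the same idea as the `EarlyStopping` callback, with parameters such as `monitor`, `patience`, and `restore_best_weights`.

```python
def early_stopping(val_losses, patience=3):
    """Return (stop_epoch, best_epoch), both 0-based.

    Training stops once the validation loss has failed to improve
    for `patience` consecutive epochs; the model from `best_epoch`
    is the one worth keeping.
    """
    best_loss = float("inf")
    best_epoch = -1
    waited = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss, best_epoch, waited = loss, epoch, 0
        else:
            waited += 1
            if waited >= patience:
                return epoch, best_epoch
    return len(val_losses) - 1, best_epoch


# Hypothetical validation losses: improving at first, then getting worse.
val_losses = [0.9, 0.7, 0.6, 0.65, 0.7, 0.8, 0.9]
print(early_stopping(val_losses))  # (5, 2)
```

In this example the validation loss is lowest at epoch 2 and training stops at epoch 5, so the later epochs, where the model only memorizes the training images, are never used.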

② Challenge to Android

Development days 6 days

(1) Many errors

I started creating the Android app. I tried to convert the trained Keras model from the hdf5 format to TensorFlow's pb format, but there were many bugs and they were difficult to fix.

(2) Ask questions online

I asked how to fix the bug on a Q&A site called teratail. However, I did not get an answer, so I could not fix it. I think this was because the question was difficult. https://teratail.com/questions/268059

(3) Format conversion to tflite

Instead of converting to a pb file, I decided to convert to the tflite format, which is often used with TensorFlow on Android and iOS. This conversion worked. I then tried to build the app from TensorFlow's Android sample, but the site I was referring to was a year out of date, and in the end I could not do it because I did not know where and how to make the changes.

As a result, I could not complete the Android app.
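For reference, a conversion from a saved Keras model to tflite can be sketched as follows. This assumes TensorFlow 2.x, which may differ from the version used in this project, and the function names and the file name `rocks.hdf5` are hypothetical. TensorFlow is imported inside the function so that the small path helper can be used without it.

```python
from pathlib import Path


def tflite_path_for(h5_path):
    """Derive the output file name, e.g. rocks.hdf5 -> rocks.tflite."""
    return str(Path(h5_path).with_suffix(".tflite"))


def convert_keras_to_tflite(h5_path):
    """Load a saved Keras model and write a .tflite file next to it."""
    import tensorflow as tf  # assumption: TensorFlow 2.x is installed

    model = tf.keras.models.load_model(h5_path)
    converter = tf.lite.TFLiteConverter.from_keras_model(model)
    tflite_bytes = converter.convert()  # serialized model as bytes

    out_path = tflite_path_for(h5_path)
    Path(out_path).write_bytes(tflite_bytes)
    return out_path


print(tflite_path_for("rocks.hdf5"))  # rocks.tflite
# convert_keras_to_tflite("rocks.hdf5")  # requires TensorFlow and the model file
```

The resulting file is what an Android or iOS app would bundle and load with the TensorFlow Lite interpreter.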

③ Change to a web browser on the rental server

Development days 3 days

I changed direction and modified the program so that it runs in a web browser instead of on Android.

(1) Use a webcam

I succeeded in running it in a web browser using a webcam. However, the correct answer rate was very low. I think the reason is that the webcam also captures the scenery around the rock, so the rock could not be classified well.

Trial production https://miyadai.sakura.ne.jp/tensorflow1/

(2) Capture images

I decided to classify an uploaded image instead of using the webcam, and the correct answer rate increased. This seems to be because, unlike webcam frames, the photos have had their brightness corrected. I rented a server and published the program on the Internet.

As a result, it was completed.

Trial production https://miyadai.sakura.ne.jp/tensorflow2/
