Wikipedia
Kaggle is a platform for predictive modeling and analytics competitions, and the company that operates it, where companies and researchers post data and statisticians and data analysts from around the world compete to build the best model.
Roughly speaking, it is the data scientist's version of TopCoder.
As part of what a modern engineer is expected to know, I read through [Introduction to Machine Learning Theory for IT Engineers](https://www.amazon.co.jp/IT%E3%82%A8%E3%83%B3%E3%82%B8%E3%83%8B%E3%82%A2%E3%81%AE%E3%81%9F%E3%82%81%E3%81%AE%E6%A9%9F%E6%A2%B0%E5%AD%A6%E7%BF%92%E7%90%86%E8%AB%96%E5%85%A5%E9%96%80-%E4%B8%AD%E4%BA%95-%E6%82%A6%E5%8F%B8/dp/4774176982), but my job is not data analysis, so I have no chance to use machine learning at work. You can't learn machine learning without getting hands-on, yet getting started seems hard because of dataset preparation and the like, and I didn't know what would make suitable study material (MNIST?). As a modern engineer, I also want to make good use of online learning and communities, and to learn more about them. I tried Dataquest's Kaggle competitions course as a tutorial and liked it.
Goal: take part in Kaggle's Titanic: Machine Learning from Disaster competition and earn a rank on the leaderboard.
Getting Started
- You have studied logistic regression and random forests in machine learning
- You can write some Python code
- A Python runtime environment
- An environment where you can write Jupyter notebooks
- Enough English that you don't mind reading English technical sites
- GPU environment (optional)
- Knowledge of deep learning (optional)
Dataquest is an online learning site that comes up when you Google "kaggle competition titanic".
Free Kaggle Tutorial - Getting Started with the Titanic Dataset
If you have a Google or Facebook account, you can start the tutorial as soon as you log in.
In this competition, you use the survival data of Titanic passengers to build a model that calculates the survival probability of the passengers in the test data, and you compete on the model's performance. Using this Kaggle competition as its theme, the tutorial walks through the basic machine learning process, from data preprocessing and model creation through training on the training dataset to prediction on the test dataset, with Python coding exercises at the key points.
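The flow described above (preprocess, train, predict) can be sketched in a few lines. This is only a minimal illustration, not the tutorial's code: it assumes pandas and scikit-learn are installed, uses a handful of made-up rows in place of the real `train.csv` and `test.csv`, and the helper `preprocess` and the feature name `IsFemale` are hypothetical.

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression

# Made-up rows shaped like the Titanic data (NOT the real dataset).
# In the competition these would come from train.csv and test.csv.
train = pd.DataFrame({
    "Pclass":   [3, 1, 3, 1, 2, 3, 1, 2],
    "Sex":      ["male", "female", "female", "female",
                 "male", "male", "male", "female"],
    "Age":      [22.0, 38.0, 26.0, 35.0, None, 54.0, 40.0, 14.0],
    "Survived": [0, 1, 1, 1, 0, 0, 0, 1],
})
test = pd.DataFrame({
    "Pclass": [3, 1],
    "Sex":    ["male", "female"],
    "Age":    [None, 47.0],
})


def preprocess(df, age_fill):
    """Turn the raw columns into purely numeric features."""
    features = pd.DataFrame(index=df.index)
    features["Pclass"] = df["Pclass"]
    features["IsFemale"] = (df["Sex"] == "female").astype(int)
    # Fill missing ages with a value computed from the training set only.
    features["Age"] = df["Age"].fillna(age_fill)
    return features


# 1. Preprocessing
age_median = train["Age"].median()
X_train = preprocess(train, age_median)
y_train = train["Survived"]
X_test = preprocess(test, age_median)

# 2. Model creation and training on the training dataset
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)

# 3. Prediction on the test dataset
survival_prob = model.predict_proba(X_test)[:, 1]  # P(Survived = 1)
predictions = model.predict(X_test)                # 0 or 1 per passenger
print(survival_prob, predictions)
```

The median age is taken from the training rows and reused for the test rows so that no information from the test data leaks into preprocessing.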
On a Dataquest tutorial screen, an explanation with an example appears on the left and you write Python code on the right. Press the run button, and if the result is correct, a "good work" message appears and you proceed to the next screen.
The tutorial even covers generating the file used to submit your model's predictions on the test dataset to Kaggle. Dataquest has two Kaggle competition courses: the first uses a simple logistic regression model to let you experience the overall flow, and the next, Improving Your Submission, focuses on raising accuracy with an ensemble learning model to aim for a better score.
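To get a feel for the step from a simple model to an ensemble, the sketch below compares logistic regression with a random forest using cross-validated accuracy. It is an illustration under stated assumptions, not the course's code: random forest is chosen here only as one common ensemble method, the data is a synthetic stand-in generated with scikit-learn's `make_classification` rather than the Titanic features, and the dictionary names are hypothetical.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Synthetic stand-in for preprocessed features and labels (NOT Titanic data).
X, y = make_classification(
    n_samples=300, n_features=6, n_informative=4, random_state=0
)

models = {
    "logistic regression": LogisticRegression(max_iter=1000),
    "random forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

mean_accuracy = {}
for name, model in models.items():
    # 5-fold cross-validation: fit on four folds, score on the held-out fold.
    scores = cross_val_score(model, X, y, cv=5, scoring="accuracy")
    mean_accuracy[name] = scores.mean()
    print(f"{name}: mean accuracy {scores.mean():.3f}")
```

Comparing models on held-out folds like this gives an estimate of how a change might affect the leaderboard score before spending a submission on it.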
If you have the prerequisite knowledge listed above, I think each course can be completed in about two hours.
After finishing the tutorial, I organized what I had learned in my own Python environment (really just copying and pasting the code I wrote in the tutorial) and generated the file to submit to Kaggle. I put the code I produced here.
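The submission file itself is a small CSV. The following is not the author's script but a minimal sketch, assuming the two-column format the Titanic competition expects (`PassengerId`, `Survived`). The helper name `write_submission` and the IDs and predictions in the demo are made up for illustration.

```python
import csv
import tempfile
from pathlib import Path


def write_submission(path, passenger_ids, predictions):
    """Write a two-column CSV: PassengerId and the predicted Survived (0/1)."""
    if len(passenger_ids) != len(predictions):
        raise ValueError("passenger_ids and predictions must have equal length")
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["PassengerId", "Survived"])
        for passenger_id, survived in zip(passenger_ids, predictions):
            writer.writerow([passenger_id, int(survived)])


# Demo with made-up IDs and predictions, written to a temporary directory.
out_path = Path(tempfile.mkdtemp()) / "submission.csv"
write_submission(out_path, [892, 893, 894], [0, 1, 0])
print(out_path.read_text())
```

In practice the IDs would come from the `PassengerId` column of the test data and the predictions from the trained model.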
Submit this to Kaggle!
I placed 1231st out of 7071 teams!
You can try it quickly and without much hassle, so I think it makes good material for self-study or study groups after reading an introductory book on machine learning. This is only the starting line: from here you could devise your own model and study how to push the score a little higher, or take part in other competitions. In my case, I'm also studying deep learning, so I went on to try the Dogs vs. Cats competition while referring to various sites. For now my score was 0.14160, which would be around 697th out of 1314 teams, but the competition had already closed, and it seems that submissions made after closing are not registered in the rankings, which was a little disappointing.
The author of Introduction to Machine Learning Theory for IT Engineers is the same age as me and also graduated from a physics department in a faculty of science (although at a different university). I would like to backpropagate through the neural network in my brain, with the author as the training data, and learn the neuron weights to find out where such a difference came from...