[PYTHON] What it took to get started with machine learning, two weeks in

Introduction

I recently started studying machine learning as a hobby, and it turned out to demand a surprisingly wide range of skills. Here I share the skills and tasks involved in getting started.

What you need for machine learning

Required skills

First, here are the skills you need if you are starting on your own.

・ Business skills
  ・ Deep understanding of the business
  ・ Logical thinking
  ・ Documentation / presentation
* From data collection onward, you are often asked to explain your reasoning logically.

・ IT skills
  ・ Wide range of IT knowledge
  ・ Knowledge of large-scale data processing
  ・ Deep knowledge of databases
  ・ Programming
* The range is really wide, so you need to be able to work across the full stack, from infrastructure to application logic.

・ Statistical analysis skills
  ・ Math
  ・ Understanding of data analysis methods
  ・ Data analysis software skills
* Probability and statistics, calculus, matrices, and so on are required.
At first I could not follow any of it because there were so many symbols.

This article explains it clearly: "It's about time to think seriously about the definition and skill set of data scientists" (Qiita)

The main tasks

Marketing

This is probably the most important part of machine learning!

Unless you are familiar with the industry, you cannot tell which data is worth collecting in the first place. This time I received a lot of advice, but I still found data missing and had to go back and collect it many times!

Data collection

This is also hard work. About 1,000 to 10,000 records can perhaps be gathered by hand, but I think you want at least 100,000. I started by collecting 1,300,000 records!

How to collect

[Do it yourself] Collect information for free with web scraping. The trade-off is that it takes development time and skill.

[Crowdsourcing] This looks cheap and attractive, but the quality of the information can suffer.

[Buy] Quick and easy, but relatively expensive (depending on the information).
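
For the do-it-yourself route, the parsing half of a scraper can be sketched with Python's standard library alone. This is only a minimal sketch under assumptions: SAMPLE_HTML, TableRowParser, and parse_rows are hypothetical names, the table contents are made up, and the inline HTML stands in for a page that a real scraper would first download, within the target site's terms of use.

```python
from html.parser import HTMLParser

# Hypothetical page fragment; a real scraper would download this text instead.
SAMPLE_HTML = """
<table>
  <tr><td>2016-05-01</td><td>item-a</td><td>120</td></tr>
  <tr><td>2016-05-02</td><td>item-b</td><td>95</td></tr>
</table>
"""


class TableRowParser(HTMLParser):
    """Collect the text of every <td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None       # cells of the row currently being read
        self._in_cell = False  # True while inside a <td> element

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._in_cell = True
            self._row.append("")

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_cell = False
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._in_cell = False

    def handle_data(self, data):
        if self._in_cell and self._row:
            self._row[-1] += data.strip()


def parse_rows(html):
    """Return the table rows found in html as lists of cell strings."""
    parser = TableRowParser()
    parser.feed(html)
    parser.close()
    return parser.rows


rows = parse_rows(SAMPLE_HTML)
for row in rows:
    print(row)
```

Each row comes back as a list of strings, so converting types and handling missing cells is left to the cleansing step.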

Information cleansing

The collected data may be partially missing or may contain irregular values, so you need to organize it into a form that is easy to analyze. This calls for database skills such as SQL and for big data processing skills (distributed processing and so on). Because the volume of data is large, expect this step to take a while.
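
As a small illustration of cleansing with SQL, the sketch below loads a few made-up records into an in-memory SQLite database and keeps only the usable ones. The table name raw, its columns, and the cleaning rules (trim and lowercase names, drop rows with a missing name or a non-numeric price) are assumptions chosen for the example, not rules from the actual project.

```python
import sqlite3

# Made-up records with the kinds of problems described above.
RAW_ROWS = [
    (1, " apple ", "120"),
    (2, "banana", None),    # missing price
    (3, "cherry", "n/a"),   # irregular price
    (4, "APPLE", "130"),
    (5, None, "90"),        # missing name
]


def clean(rows):
    """Return (id, name, price) tuples with normalized names and integer prices."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE raw (id INTEGER, name TEXT, price TEXT)")
    conn.executemany("INSERT INTO raw VALUES (?, ?, ?)", rows)
    cleaned = conn.execute(
        """
        SELECT id, LOWER(TRIM(name)), CAST(price AS INTEGER)
        FROM raw
        WHERE name IS NOT NULL
          AND price IS NOT NULL
          AND price <> ''
          AND price NOT GLOB '*[^0-9]*'  -- keep digit-only prices
        ORDER BY id
        """
    ).fetchall()
    conn.close()
    return cleaned


cleaned = clean(RAW_ROWS)
for record in cleaned:
    print(record)
```

On millions of records the same idea would run against a real database or a distributed engine rather than an in-memory table.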

Analysis / classification

Before you can make predictions with machine learning, you need statistical skills to analyze the data you already have. You have to work out the characteristics of the data and infer which data you need for prediction. This step calls for quantitative evaluation rather than reliance on qualitative judgments such as common sense.
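
One way to make that evaluation quantitative is to summarize each column and measure how strongly it moves with the value you want to predict. The sketch below does this with the standard library; describe and pearson are hypothetical helper names and the numbers are made up for illustration.

```python
import math
import statistics


def describe(values):
    """Return basic summary statistics for a list of numbers."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),  # sample standard deviation
    }


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Made-up numbers, purely for illustration.
feature_a = [1.0, 2.0, 3.0, 4.0, 5.0]
feature_b = [2.0, 1.0, 4.0, 3.0, 5.0]
target = [2.1, 3.9, 6.2, 7.8, 10.1]

summary = describe(target)
corr_a = pearson(feature_a, target)
corr_b = pearson(feature_b, target)
print(summary)
print(f"feature_a vs target: {corr_a:.3f}")
print(f"feature_b vs target: {corr_b:.3f}")
```

A feature with a higher absolute correlation is a stronger candidate for the model, although correlation only captures linear relationships.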

Learning model creation

Create a model that predicts the future from the characteristics of the data. Programming skills are required here. A development language such as Python is a prerequisite. There are recommended settings for things like parameter tuning, so the hurdle is not that high, but you do have to select an algorithm that suits the type and purpose of the data.

[List of machine learning libraries]
TensorFlow: https://www.tensorflow.org/
Chainer: http://chainer.org/
Caffe: http://caffe.berkeleyvision.org/
Theano: http://deeplearning.net/software/theano/index.html
Torch: http://torch.ch/
scikit-learn: http://scikit-learn.org/stable/
PyML: http://pyml.sourceforge.net/
Pylearn2: http://deeplearning.net/software/pylearn2/
PyBrain: http://pybrain.org/pages/home
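
To give a feel for algorithm selection and parameter tuning, here is a minimal sketch using scikit-learn, one of the libraries listed above. The choices are assumptions made for illustration only: the bundled iris dataset stands in for real data, k-nearest neighbors stands in for whatever algorithm suits your data, and fit_tuned_knn is a hypothetical helper name.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier


def fit_tuned_knn(random_state=0):
    """Tune n_neighbors by cross-validation and score on held-out data."""
    X, y = load_iris(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, random_state=random_state, stratify=y
    )
    # Try each candidate value with 5-fold cross-validation on the training set.
    search = GridSearchCV(
        KNeighborsClassifier(),
        param_grid={"n_neighbors": [1, 3, 5, 7, 9]},
        cv=5,
    )
    search.fit(X_train, y_train)
    # score() uses the best model found, refit on the whole training set.
    return search.best_params_, search.score(X_test, y_test)


best_params, test_accuracy = fit_tuned_knn()
print(best_params, round(test_accuracy, 3))
```

The grid search automates the trial and error over parameters, but deciding which algorithm and which parameter range to search is still up to you.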

Prediction and experimentation

Make predictions with the model you trained on the training data. You need to repeat the cycle of training, testing, and predicting many times as an experiment. Also, high accuracy at this stage does not prove the model is good: it may simply be overfitting.

You need to keep experimenting with real data to see whether it really works.
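
Overfitting is easy to reproduce on a toy problem. The sketch below is an illustration under assumptions, not the setup used for this project: the data is synthetic with 20% of the labels deliberately flipped, and the "model" is a 1-nearest-neighbor rule that simply memorizes the training set. All function names are hypothetical.

```python
import random


def make_data(n, rng, noise=0.2):
    """Points in [0, 1) labelled 1 when x > 0.5, with some labels flipped."""
    data = []
    for _ in range(n):
        x = rng.random()
        label = int(x > 0.5)
        if rng.random() < noise:
            label = 1 - label  # simulated measurement noise
        data.append((x, label))
    return data


def predict_1nn(train, x):
    """Return the label of the training point closest to x."""
    nearest = min(train, key=lambda point: abs(point[0] - x))
    return nearest[1]


def accuracy(train, data):
    """Fraction of points in data that the 1-NN rule labels correctly."""
    hits = sum(predict_1nn(train, x) == label for x, label in data)
    return hits / len(data)


rng = random.Random(0)  # fixed seed so the run is repeatable
train_set = make_data(200, rng)
test_set = make_data(200, rng)

train_acc = accuracy(train_set, train_set)  # the model has memorized these
test_acc = accuracy(train_set, test_set)    # unseen data from the same source
print(f"train accuracy: {train_acc:.2f}, test accuracy: {test_acc:.2f}")
```

The training accuracy is perfect because every training point is its own nearest neighbor, while the accuracy on unseen data is lower. That gap is the reason to judge a model on data it has not seen.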

Finally

Since there are not many machine learning engineers yet, it can be hard to find people to consult about algorithm selection, parameter selection, normalization methods, and so on, and that gets lonely. Still, there is so much new to learn that it is exciting. I highly recommend it to everyone.

I run a site where I experiment with predicting the future using the skills I am studying, so please take a look when you have time! [facebook] https://www.facebook.com/AIkeiba/ [Twitter] https://twitter.com/Siva_keiba
