Data science and machine learning look interesting, but have you ever felt that you don't know how to study them? I was one of those people. In this article, based on my own experience, I introduce ***how to study machine learning, starting from "what is machine learning in the first place?", so that even beginners who know the word AI but not the details can build the basic knowledge and experience needed to start working on it***. (I think the materials introduced here are also useful as a review of the basics for readers who are intermediate or above in machine learning.)
[amazon link](https://www.amazon.co.jp/%E5%9B%B3%E8%A7%A3%E5%8D%B3%E6%88%A6%E5%8A%9B-%E6%A9%9F%E6%A2%B0%E5%AD%A6%E7%BF%92-%E3%83%87%E3%82%A3%E3%83%BC%E3%83%97%E3%83%A9%E3%83%BC%E3%83%8B%E3%83%B3%E3%82%B0%E3%81%AE%E3%81%97%E3%81%8F%E3%81%BF%E3%81%A8%E6%8A%80%E8%A1%93%E3%81%8C%E3%81%93%E3%82%8C1%E5%86%8A%E3%81%A7%E3%81%97%E3%81%A3%E3%81%8B%E3%82%8A%E3%82%8F%E3%81%8B%E3%82%8B%E6%95%99%E7%A7%91%E6%9B%B8-%E6%A0%AA%E5%BC%8F%E4%BC%9A%E7%A4%BE%E3%82%A2%E3%82%A4%E3%83%87%E3%83%9F%E3%83%BC/dp/429710640X)
When you start learning a new field, the first problem you hit is that you "don't understand the Japanese", that is, you don't understand the technical terms. This is true of every discipline, but I think it is especially true of machine learning. This book is recommended for getting past that stage; it is literally a "textbook" that teaches you from the basics.
- Chapter 1: Basic knowledge of artificial intelligence (opens with a good summary of the basics: AI, machine learning, deep learning, historical background, and so on)
- Chapters 2-4: Machine learning (from the basic knowledge that often appears in machine learning literature through to algorithms)
- Chapters 5-7: Deep learning (from the basic knowledge of deep learning through to processes and algorithms)
- Chapter 8: System and development environments (from choosing a programming language to machine learning libraries and frameworks and deep learning frameworks)
1. Read it through once (skim, skipping over words you don't understand). → Just doing this makes later learning easier, because when you read other articles and literature you will be meeting words you have seen before rather than completely unknown ones!
2. Reread the parts that cover the basic knowledge, plus the final chapter, which gives an overview of the development flow (Chapters 1, 2, 5, and 8).
3. After reading that much, move on to the next step and keep this book as a reference to consult whenever you come across a word you don't understand.
Kame @ US Data Scientist's blog
Kame-san's blog as a whole has a lot to learn from, so it is worth finding time to read it, but this course in particular is recommended. It is very easy to follow because the libraries you end up using practically every time you do machine learning are organized systematically. (I think the Kaggle Start Book, which I introduce later, is easier to understand after reading this course.)
(The "Purpose of this course" section on the blog explains the content clearly, so I quote from it here.) The purpose of this course is to master the environment setup needed for data science in Python, the basics of Python, the basics of the Python libraries used in data science, and the "basic" usage of the Python modules that appear frequently in data science.
What is the goal of this course?
Master the basics of using tools, libraries, and modules to process the data needed for data science in Python.
- You can process data without spreadsheet tools such as Excel
- You can process data files such as image files
- You can automate everyday data processing (Excel, etc.) with Python

That is the kind of goal it aims for. It also touches on some statistics, but please note that it is not a "course on data science" but a "course on Python for data science".
However, since the course includes plenty of "techniques that can be used in the field" and "techniques that come up often in data science", you can learn data science in a broad sense.
In any case, I have tried to write it so that it is easy to understand. There are few difficult words and everything is explained in a very digestible way, so I don't think you will get stuck partway through.
Also, rather than teaching textbook-style, the course includes "how it is actually used in the field" throughout. The result is "content that is reasonably systematic and comprehensive and that you can use in real work". (End of quote)
What the course covers:
- The basics of Python
- NumPy (used for numerical computation)
- Pandas (for manipulating and analyzing data; it does spreadsheet-style work like Excel, but faster)
- matplotlib (for drawing graphs)
- Seaborn (draws graphs like matplotlib, but they look cleaner and are easier to draw)
- Other useful libraries, modules, etc.
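To give a feel for what these libraries do, here is a minimal sketch of NumPy and Pandas on a tiny table. It is not taken from the course; the data and the names `heights_cm`, `group`, and so on are made up for illustration, and plotting with matplotlib or Seaborn is left out to keep it short.

```python
import numpy as np
import pandas as pd

# NumPy: arithmetic applies to the whole array at once (no explicit loop).
heights_cm = np.array([150.0, 160.0, 170.0, 180.0])  # invented sample values
heights_m = heights_cm / 100
mean_height = heights_cm.mean()

# Pandas: a labeled table, similar to a spreadsheet sheet.
df = pd.DataFrame({
    "name": ["a", "b", "c", "d"],
    "group": ["x", "y", "x", "y"],
    "height_cm": heights_cm,
})

# Filter rows with a condition, as you would with a spreadsheet filter.
tall = df[df["height_cm"] >= 165]

# Aggregate per group, as you would with a pivot table.
group_means = df.groupby("group")["height_cm"].mean()

print(mean_height)
print(tall)
print(group_means)
```

Typing this in and changing the numbers is a quick way to check that the environment is set up before starting the course proper.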
1. First, read it through (the iron rule!).
2. This can be done alongside step 1: type in and run the code yourself to see the results and how things behave. (I think there are many things you can't learn just by reading.) → The Kaggle competition called Titanic (Kaggle is explained later in this article), which the course uses as its subject, is also the tutorial in the Kaggle Start Book introduced later, so running the code yourself here should deepen your understanding further.
3. Read it repeatedly and come back to it from time to time.
[amazon link](https://www.amazon.co.jp/%E5%AE%9F%E8%B7%B5Data-Science%E3%82%B7%E3%83%AA%E3%83%BC%E3%82%BA-Python%E3%81%A7%E3%81%AF%E3%81%98%E3%82%81%E3%82%8BKaggle%E3%82%B9%E3%82%BF%E3%83%BC%E3%83%88%E3%83%96%E3%83%83%E3%82%AF-KS%E6%83%85%E5%A0%B1%E7%A7%91%E5%AD%A6%E5%B0%82%E9%96%80%E6%9B%B8-%E7%A5%A5%E5%A4%AA%E9%83%8E/dp/4065190061)
This is a Kaggle tutorial book co-written by the two authors of "What to do next after registering with Kaggle ~ You can fight enough if you do this! Introduction to Titanic 10 Kernel ~" and "kaggle Tutorial". (Both of the underlying works are popular and easy to understand in their own right!)
The two steps introduced so far are somewhat light on practice, and I think a lot is gained by learning while actually using things. Even so, the honest truth is that it is hard to know where to start. In that situation, working through this book while entering the "Titanic" competition, Kaggle's tutorial for beginners, is a good first step!
- Chapter 1: What is Kaggle? (covers everything up to creating an account; ideal as an introduction)
- Chapter 2: Titanic tutorial
- Chapter 3: How to handle multiple tables and image/text data (also introduces competitions other than the Titanic competition tried in Chapter 2)
- Chapter 4: Tips for learning further (content that points to what comes next, so it doesn't end with just finishing the tutorial)
The sample code is published, so if you follow along (you do need to register a Kaggle account) you can practice the whole flow while writing almost no code yourself. (If you don't yet understand the details but want an overview, it may be worth running it once and then reading through the code.)
Some pages are written as a dialogue between the two authors. They cover not only the points and perspectives that advanced users take for granted, but also why the authors started Kaggle and what they got out of it, so there is a lot to learn from them.
Kaggle's background knowledge and extra tips are summarized in column form, which makes them easy to take in. (You could probably find the same information by searching yourself, but having it explained clearly is a real help for beginners.)
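As a rough idea of what a first Titanic submission involves, here is a minimal sketch that is not taken from the book. It assumes the competition's `PassengerId`, `Survived`, `Sex`, and `Pclass` columns, but uses a few invented rows in place of the real `train.csv` and `test.csv`, and a simple rule based on sex instead of a machine learning model.

```python
import pandas as pd

# Invented stand-ins for Kaggle's train.csv and test.csv (not real passenger data).
train = pd.DataFrame({
    "PassengerId": [1, 2, 3, 4, 5, 6],
    "Survived": [0, 1, 1, 1, 0, 0],
    "Sex": ["male", "female", "female", "female", "male", "male"],
    "Pclass": [3, 1, 3, 1, 3, 3],
})
test = pd.DataFrame({
    "PassengerId": [7, 8, 9],
    "Sex": ["female", "male", "female"],
    "Pclass": [2, 3, 1],
})

# Step 1: look at the data, here the survival rate for each sex.
rate_by_sex = train.groupby("Sex")["Survived"].mean()

# Step 2: a rule-based baseline. Predict 1 (survived) for a sex whose
# survival rate in the training data is above 0.5, otherwise 0.
predicted = test["Sex"].map(rate_by_sex > 0.5).astype(int)

# Step 3: build the submission table with one prediction per test passenger.
submission = pd.DataFrame({
    "PassengerId": test["PassengerId"],
    "Survived": predicted,
})
print(submission.to_csv(index=False))
```

With the real files you would load them with `pd.read_csv` and replace the rule in step 2 with a trained model, but the overall flow of exploring, predicting, and writing a submission stays the same.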
(I haven't worked through all of it yet, so this is my current study plan.)
"I tried to organize Kaggle's Competition Categories" ← This article explains the Kaggle competition categories clearly!
"Kaggle is the world's largest data science community with powerful tools and resources to help you reach your data science goals." ↑ Quoted because Mr. Harada of DeNA's slides from Devsumi 2018 Summer had a page that makes it easy to grasp intuitively what Kaggle is.
I don't have a long history with machine learning myself and still consider myself a beginner, but thanks to my environment and good teaching materials I am gradually getting the hang of it. By summarizing these basic study methods, I hope this article is useful to people who are interested in machine learning but don't know how to study it on their own. As I keep studying, I expect to find better ways of doing things, and ways that would make studying easier for others too, including intermediate and advanced learners. I hope to update this article from time to time or write those up in a separate article!
By the way, as a continuation of the three steps in this article (though it may overlap with the competition practice in step 3), I think you can improve your skills further by referring to articles such as the following. "Summary of recommended materials for machine learning beginners to finish Kaggle's "introduction" at high speed": an article by Mr. Murata (Curry-chan), one of the authors of the Kaggle Start Book.
Recommended (A podcast that talks about Kaggle themes and new competitions every week)
kaggler-ja Slack (the Japanese Kaggler community on Slack)