Kaggler, a student who cannot analyze data, analyzes himself

Introduction

Self-introduction

My name is Chizuchizu. I'm 14 years old and in my second year of junior high school. I have a lot of hobbies, but I was especially into typing (past tense).

Now I'm a low-skilled, show-off Kaggler.

I "tentatively" won a bronze medal in the IEEE competition. Why it is only "tentative" will become clear later.

After getting tired of AtCoder, I started Kaggle. I will definitely become a Master by myself.

The reason I decided to write

I wanted to write something for an Advent calendar. I considered a technical article, but my skills aren't yet at a level where I can introduce techniques, so instead I'd like to describe my current Kaggle slump as it really is.

I had assumed Kaggle would be like competitive programming, and I completely underestimated it. You can't get by without doing the things that need to be done ...

In short, this is an article in which **a Kaggler who cannot analyze data analyzes himself** (a self-development scientist).

It is all text, so it may be hard to read, but I would be happy if you read it anyway. I would also welcome advice, however blunt. Please.

The lessons are summarized below.

My current Kaggle slump

I thought it would be better to describe my current situation first, before writing about the past and the future. I'm participating in several competitions now, but I'm in a slump, and there is a lot I don't understand.

Language barrier

I speak Japanese. I'm not good at English. I'm learning English. I should keep learning a lot :) My mother tongue is Japanese, so I'm not good at English. (I'm trying to learn a little through online English conversation lessons.)

Everything on Kaggle is in English, including the Discussion forums. You can of course write in Japanese, but then only Japanese speakers will join the conversation, so it seems better to use English even if that means relying on Google Translate.

It's a foreign language, so of course it's difficult, but if you read carefully, you can understand it (especially notebooks, since they have graphs and code). I realized that "I can't do it because of the language" was just an excuse.

As long as you don't try to say anything complicated, junior-high-school English is enough for most exchanges. (The same goes for GitHub issues and discussions.)

The text isn't unreadable with Google Translate, but I'm still a little reluctant to rely on it. I switched everything on my smartphone to English so that I come into contact with English every day ... (but I'm not used to it yet).

I don't understand the essence

I think this problem comes along with the language barrier: I don't know the characteristics of the data in the first place, so I don't know what to analyze. Day after day, I just read EDA notebooks and the day ends without my understanding anything. I'm stuck. Reading EDA kernels without understanding "what I want to do" was a million years too early for me. I regret it.

How many times have I lost heart?

As the days of not understanding pile up, I get more and more depressed and lose my motivation. During the Mynavi competition I was full of motivation, because it was in Japanese and I knew what to do.

I'm motivated, but it feels like racing for beach flags in a thick fog, and I want to give up.

I want to read the Kaggle book and find a way to see what lies ahead ...

I can't even be a "glitter" programming beginner

If anything, I'm just someone who adds to the darkness.

All I have are dark incidents, such as a failed build that wrecked my entire environment and forced me to reinstall the OS, or swap that overflowed, filled a 1TB SSD, and left the machine unbootable.

When I look at Kaggle discussions, many people find interesting features, and I'm impressed by how amazing that is, but I wonder how many hours they spent staring at the data to get there.

If I could understand the essence, I think everything would follow from that. I would know what to do, so I would keep forming hypotheses, running experiments, verifying them, and raising my score, and my happiness would increase monotonically. Could I become a glitter programmer if I understood the essence?

The story of how I crashed trying to become a cool programmer

I crashed as a result of chasing an ideal image. The reason is simple: I spent my time on things that were not the essence, and so I lost sight of the essence (data science). I also meant to use Git, but I write code on a whim, so I forget to commit. With classes, it is a hassle to inspect the variables inside, so I will write notebook-style code as much as possible.

What happened in the IEEE competition

That was the first competition in which I won a medal on Kaggle. However, I can't be proud of it, because I finished without understanding anything: I didn't understand the data and pushed ahead anyway. My rank just went up. Some might say that counts as ability too, but I didn't do much and didn't understand anything, so I don't feel that I really earned the medal. That is why I wrote that I "tentatively" won a medal.

I want to win a medal I can be proud of in the next competition ...

Summary

I am in exactly the state described in "Children who cannot read textbooks". (I bought this book about a year ago.)

The book says that people without reading comprehension will be replaced by AI, but if the people who build AI lack reading comprehension too, the whole thing falls apart ...

I realized that I need to improve my reading comprehension of data, or rather my reading comprehension of competitions.

Reflections and lessons learned

Understand the purpose

I think this is probably the most important thing. Until now I have been looking at data and writing code without knowing the "purpose", and because I was only going through the motions, it didn't work at all. On top of that, I couldn't follow the discussions well (because I didn't understand the purpose).

First of all, I realized that I should start by understanding the purpose and clarifying what I need to do before going further into a competition. Rather than lazily doing mysterious EDA or analyzing things I don't understand, I will read the Overview properly. (Starter kernels also seem helpful.)
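As a rough sketch of what "clarify what I should do first" could look like in code, the snippet below takes a first look at a table before any real EDA: how many rows there are, which columns have missing values, and how the target is distributed. The function name `first_look`, the toy CSV, and the column names are all hypothetical and do not come from any actual competition; it uses only the standard library so it runs anywhere.

```python
import csv
import io
from collections import Counter


def first_look(csv_text, target):
    """Return basic facts about a CSV: row count, columns,
    per-column missing counts, and the target's value counts."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    columns = list(rows[0].keys()) if rows else []
    # An empty string is treated as a missing value in this sketch.
    missing = {c: sum(1 for r in rows if r[c] == "") for c in columns}
    target_counts = dict(Counter(r[target] for r in rows))
    return {
        "n_rows": len(rows),
        "columns": columns,
        "missing": missing,
        "target_counts": target_counts,
    }


# Hypothetical toy data standing in for a competition's training file.
SAMPLE = """id,amount,device,target
1,10.5,mobile,0
2,,desktop,0
3,99.0,,1
4,12.0,mobile,0
"""

summary = first_look(SAMPLE, target="target")
for key, value in summary.items():
    print(key, value)
```

Even a summary this small answers "what am I predicting, and how imbalanced is it?" before reading anyone else's EDA.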

Output (?)

I'm a beginner, so this may not apply to everyone, but I think I'm someone who should publish output.

As I said earlier, I can't move forward unless I understand the purpose, but what if my understanding is off the mark? Unless someone calls me out (to put it in extreme terms) with "You! What are you doing? That's wrong!", I won't get back on track.

Output sometimes gets recognized too, so I think I should publish as much as I can, within the bounds of common sense.

In the past I published a kernel with Japanese mixed in, an embarrassing piece of my history (I wrote the kernel in Japanese even though I said to write in English), but I don't think I lost anything by writing it.

Don't be discouraged

This is a lesson from the Mynavi competition. I actually worked solo for a while at first and gave up because the accuracy did not improve at all. After merging into a team, though, I researched and analyzed all sorts of things with the attitude of "I'll do it for the team!!!", and I made new discoveries.

Get in the habit of taking notes

I felt that I should note down everything I did each day (experiments and results) in both a Jupyter Notebook and a one-person Slack channel. People forget things after three days, and there is no way I can remember what I did on which day of a two-month competition. Taking notes helps me understand what to do next and avoid repeating experiments unnecessarily. (The notes don't need to be long.)

The past logs of Slack were also helpful during the Mynavi competition. I wasn't really aware of it when I participated individually, but I noticed it when I worked as a team.
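The note-taking habit can also be partly automated. The following is only a minimal sketch, not what I actually used: it appends one short record per experiment to a JSON Lines file and reads the records back, so that I can check whether something was already tried. The file name, the function names, the experiment descriptions, and the scores are all made up for illustration, and it assumes a metric where higher is better.

```python
import json
import tempfile
from datetime import date
from pathlib import Path


def log_experiment(log_path, what, score, note=""):
    """Append one experiment record as a single JSON line."""
    record = {
        "date": date.today().isoformat(),
        "what": what,
        "score": score,
        "note": note,
    }
    with open(log_path, "a", encoding="utf-8") as f:
        f.write(json.dumps(record, ensure_ascii=False) + "\n")
    return record


def read_log(log_path):
    """Read every record back, oldest first."""
    with open(log_path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


def already_tried(log_path, what):
    """True if an experiment with this exact description is in the log."""
    return any(r["what"] == what for r in read_log(log_path))


# A temporary directory is used here; in practice the file would sit
# next to the notebook. Descriptions and scores are invented.
log_path = Path(tempfile.mkdtemp()) / "experiments.jsonl"
log_experiment(log_path, "baseline model, default parameters", 0.912)
log_experiment(log_path, "add frequency encoding", 0.918, "small gain")

best = max(read_log(log_path), key=lambda r: r["score"])  # higher is better here
print(best["what"], best["score"])
print(already_tried(log_path, "add frequency encoding"))
```

One line per experiment is short enough to keep up every day, and the same text can be pasted into a Slack channel.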

In conclusion

Thank you for reading.

Having written this article, I feel like I can escape from the slump.

I don't think this is everything, and since I am still below Kaggle Expert, it may be far from the optimal solution. I wonder if Kaggle Grandmasters do data science as unconsciously as breathing ... (I can't know, since I'm not one of them.)

But I hope this article helps someone.

Please comment on anything!

I'm a beginner who hasn't even grasped the "D" of data science yet, but I'll improve my skills little by little. I'm in a slump, but I haven't given up on Kaggle yet.

I want to become a "true" data scientist and be able to write Qiita articles again.
