[PYTHON] Kaggle for the first time (kaggle ①)

Introduction

This is the story of my first time participating in a Kaggle competition. To start, I try the "Titanic" competition.

Table of contents

  1. What is Kaggle?
  2. Register as a member of Kaggle
  3. Participate in the Titanic competition
  4. Write the code
  5. Submit the learning results
  6. Summary

1. What is Kaggle?

Kaggle is a community site, headquartered in the United States, for people around the world who work in data science and machine learning. The highlight of Kaggle is its competitions, in which participants compete on their data analysis skills. Eventually I would like to compete in earnest, but there is a tutorial competition, so I will start with that.

2. Register as a member of Kaggle

Register as a member on the Kaggle site: https://www.kaggle.com/ Probably because I already had a Google account, "Welcome sudominoru" was displayed and I was already registered as a member.

20191208_01.png

I uploaded a photo from "Edit Profile" and registered "City".

3. Participate in the competition

For those who want to try Kaggle, the first thing to try is the "Titanic" tutorial competition. I will participate in this competition.

20191208_02.png

When you select "Competitions" at the top of the screen, the message "Our Titanic Competition is a great first challenge to get started." appears, and Titanic is displayed at the top of the list. Click Titanic to select it.

20191208_03.png

Click "Join Competition" to participate in the Titanic competition.

20191208_04.png

The message "Please read and accept the competition rules" is displayed, so I checked the rules and agreed. With that, I was able to participate.

4. Write the code

Kaggle provides an environment for writing code. Let's write some right away.

20191208_05.png

From "Notebooks", click "Your Work" and then "Create New Notebook".

20191208_06.png

Select the language and the type ("Notebook" or "Script"). I chose "Python" as the language and "Notebook" as the type. As I understand it, "Notebook" is like Jupyter Notebook, while "Script" is closer to writing code in an editor such as Spyder. I usually write code in Spyder, but on Kaggle I want to write explanations alongside the code, so I will proceed with "Notebook".

20191208_07.png

A screen similar to Jupyter Notebook is displayed, as shown in the image above. The input directory on the right side contains the training data (train.csv) and the test data (test.csv) used in this competition. "gender_submission.csv" is explained later; it is a sample submission file for the competition. If you run the sample code as it is, the names of these files are printed. Now you are ready to write your code.
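The file listing that the sample code prints can be reproduced with a few lines. The following is a minimal sketch, assuming the competition data is mounted under /kaggle/input as in the paths used later in this article; the helper name list_input_files is hypothetical and not part of Kaggle.

```python
import os


def list_input_files(root):
    """Return the path of every file found under root, sorted by name."""
    paths = []
    for dirname, _, filenames in os.walk(root):
        for filename in filenames:
            paths.append(os.path.join(dirname, filename))
    return sorted(paths)


# Assumption: on Kaggle the competition files live under /kaggle/input.
# Outside Kaggle this directory does not exist and nothing is printed.
for path in list_input_files('/kaggle/input'):
    print(path)
```

In a Titanic notebook this should list train.csv, test.csv and gender_submission.csv.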

5. Submit the learning results

Earlier, I said that "gender_submission.csv" is a sample submission file for the competition. Looking at its contents, it has the same PassengerId values as "test.csv", and the same number of rows. The flow is as follows.

  1. Train a model using "train.csv"
  2. Predict on "test.csv" with the trained model, and output the predictions to "gender_submission.csv"
  3. Submit "gender_submission.csv"
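The three steps above can be sketched in a few lines of pandas. This is only an illustration under stated assumptions, not the model built in this series: the "model" is a simple rule that predicts survival from the majority outcome for each value of Sex, the rows are made up to stand in for train.csv and test.csv, and the function names fit_sex_rule and predict are hypothetical. The column names PassengerId, Sex and Survived follow the Titanic data.

```python
import pandas as pd


def fit_sex_rule(train):
    """Step 1: learn, for each value of Sex, whether most such passengers survived."""
    survival_rate = train.groupby("Sex")["Survived"].mean()
    return {sex: int(rate > 0.5) for sex, rate in survival_rate.items()}


def predict(rule, test):
    """Step 2: build a submission table with one prediction per test passenger."""
    survived = test["Sex"].map(rule).fillna(0).astype(int)
    return pd.DataFrame({"PassengerId": test["PassengerId"], "Survived": survived})


# Made-up rows standing in for train.csv and test.csv (not real competition data)
train = pd.DataFrame({
    "PassengerId": [1, 2, 3, 4, 5, 6],
    "Sex": ["male", "female", "female", "male", "male", "female"],
    "Survived": [0, 1, 1, 0, 1, 0],
})
test = pd.DataFrame({
    "PassengerId": [892, 893, 894],
    "Sex": ["male", "female", "male"],
})

rule = fit_sex_rule(train)
submission = predict(rule, test)

# The submission must have exactly one row per passenger in the test data
assert len(submission) == len(test)

# Step 3: write the file that would be submitted
submission.to_csv("submission.csv", index=False)
print(submission)
```

On Kaggle, the two made-up tables would be replaced by pd.read_csv calls on the files in the input directory.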

As a test, let's output "gender_submission.csv" as it is.

# Any results you write to the current directory are saved as output.

# The notebook's sample code normally imports pandas already; repeated here so this cell runs on its own
import pandas as pd

# Load the sample submission file provided with the competition
df_gender_submission = pd.read_csv('/kaggle/input/titanic/gender_submission.csv')
# Write it unchanged to the current directory so that it is saved as an output file
df_gender_submission.to_csv('gender_submission.csv', index=False)

Add the above code and click "Commit" in the upper right to execute it.

20191208_08.png

After a while, the execution result is displayed. Click "Open Version".

20191208_09.png

The content of the result is displayed. You can see that "gender_submission.csv" is output to "Output Files" at the bottom of the screen. Click "Submit to Competition" to submit.

20191208_10.png

The screen moves to the "Leaderboard" and the result is displayed. The Score is "0.76555", meaning that about 76.6% of the predictions were correct. You can move to your own ranking with "Jump to your position on the leaderboard".
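The score here is an accuracy: the fraction of passengers whose survival was predicted correctly. The following is a minimal sketch of that calculation. The true labels for the test data are held by Kaggle, so the values below are made up for illustration, and the function name accuracy is hypothetical.

```python
def accuracy(predicted, actual):
    """Return the fraction of predictions that match the actual labels."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("at least one label is required")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


# Made-up labels: three of the four predictions match
predicted = [0, 1, 0, 1]
actual = [0, 1, 1, 1]
print(accuracy(predicted, actual))  # 0.75
```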

6. Summary

I now have a rough understanding of how to use Kaggle. Next time, I would like to move on to training a model with the Titanic data.

Reference

First Kaggle Tutorial [Introduction to Beginners] https://note.com/toshioakaneya/n/na582cb273153

History

2019/12/08 First edition released
