This is the story of participating in a Kaggle competition for the first time. As a first step, I try the "Titanic" competition.
Kaggle is a community site, headquartered in the United States, used by people involved in data science and machine learning around the world. Its main attraction is the competitions, in which participants compete on their data analysis skills. Eventually I would like to enter a real competition, but there seems to be a tutorial, so I will start by working through that.
Register as a member on the Kaggle site (https://www.kaggle.com/). Probably because I already have a Google account, "Welcome sudominoru" was displayed and I was already registered as a member.
I uploaded a photo from "Edit Profile" and registered "City".
For anyone who wants to try Kaggle, the first thing to do is the "Titanic Tutorial". I will participate in this competition.
When you select "Competitions" at the top of the screen, the message "Our Titanic Competition is a great first challenge to get started." appears, and Titanic is displayed at the top. Click it to select Titanic.
Click "Join Competition" to participate in the Titanic competition.
The message "Please read and accept the competition rules" appears, so I checked the rules and accepted them. I was now able to participate.
Kaggle provides an environment for writing code. Let's write some right away.
From "Notebooks", click "Your Work" and then "Create New Notebook".
Select the language and the type ("Notebook" or "Script"). I chose "Python" as the language and "Notebook" as the type. I think of "Notebook" as being like Jupyter Notebook, and "Script" as being like writing code in Spyder. I usually write code in Spyder, but on Kaggle I want to write explanations alongside the code, so I will proceed with "Notebook".
A screen like Jupyter Notebook is displayed. The input directory on the right side contains the training data (train.csv) and the test data (test.csv) used in this competition. "gender_submission.csv" will be explained later, but it is sample data for submitting to the competition. If you run the sample code as it is, the file names are printed. Now you are ready to write your code.
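The file listing mentioned above can be reproduced with a sketch along these lines. It assumes the competition data is mounted under /kaggle/input, as the path used later in this post suggests; the helper name list_input_files is made up for illustration and is not part of Kaggle.

```python
import os


def list_input_files(root="/kaggle/input"):
    """Return the paths of all files found under root, sorted."""
    paths = []
    for dirname, _, filenames in os.walk(root):
        for filename in filenames:
            paths.append(os.path.join(dirname, filename))
    return sorted(paths)


# Assumed location of the competition data inside a Kaggle notebook.
# Outside Kaggle the directory does not exist and nothing is printed.
for path in list_input_files():
    print(path)
```

Passing a different root, for example a local folder holding the downloaded CSV files, should let the same helper run outside Kaggle.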
Earlier, I said that "gender_submission.csv" is sample data for submitting to the competition. Looking at its contents, it has the same PassengerId values as "test.csv", and the number of rows is also the same. The overall flow, then, is to produce a file in this format and submit it.
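The observation that the sample submission lines up with "test.csv" can be checked with a small sketch like the one below. The function name same_passengers and the tiny frames are invented for illustration; only the PassengerId column name comes from the text.

```python
import pandas as pd


def same_passengers(df_test, df_submission):
    """True if both frames have the same PassengerId values in the same order."""
    return (
        len(df_test) == len(df_submission)
        and df_test["PassengerId"].tolist() == df_submission["PassengerId"].tolist()
    )


# Tiny made-up frames standing in for test.csv and gender_submission.csv.
df_test = pd.DataFrame(
    {"PassengerId": [892, 893, 894], "Sex": ["male", "female", "male"]}
)
df_submission = pd.DataFrame(
    {"PassengerId": [892, 893, 894], "Survived": [0, 1, 0]}
)
print(same_passengers(df_test, df_submission))
```

On Kaggle, the two frames would presumably be loaded with pd.read_csv from the input directory instead of being built by hand.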
As a test, let's output "gender_submission.csv" as it is.
# Any results you write to the current directory are saved as output.
import pandas as pd  # normally already imported by the notebook's starter cell; repeated so this snippet runs on its own
# Load gender_submission.csv
df_gender_submission = pd.read_csv('/kaggle/input/titanic/gender_submission.csv')
# Write gender_submission.csv to the current directory
df_gender_submission.to_csv('gender_submission.csv', index=False)
Add the above code and click "Commit" in the upper right to execute it.
After a while, the execution result will be displayed. Click Open Version.
The content of the result is displayed. You can see that "gender_submission.csv" is output to "Output Files" at the bottom of the screen. Click "Submit to Competition" to submit.
The screen moves to the "Leaderboard" and the result is displayed. The score is "0.76555", meaning that about 76.6% of the predictions were correct. You can move to your own ranking with "Jump to your position on the leaderboard".
I now have a rough understanding of how to use Kaggle. Next time, I would like to continue learning with the Titanic competition.
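To make the score concrete, here is a minimal sketch that assumes the score is plain classification accuracy, that is, the fraction of predictions that match the true labels. The function and the labels below are made up for illustration; the real answers are held by Kaggle and are not available to participants.

```python
def accuracy(predicted, actual):
    """Fraction of positions where predicted equals actual."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if len(actual) == 0:
        raise ValueError("cannot compute accuracy of empty input")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


# Made-up labels for illustration: 3 of the 4 predictions match.
print(accuracy([0, 1, 1, 0], [0, 1, 0, 0]))  # 0.75
```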
First Kaggle Tutorial [Introduction to Beginners] https://note.com/toshioakaneya/n/na582cb273153
2019/12/08 First edition released