A Python beginner tries a quick and easy analysis of the last 10 years of weather data

Introduction

I recently started studying Python. To mark the occasion, I would like to write a post that is useful for others who are just getting started with Python, as I am. Since I have never written Python before, this time I will work through a simple data analysis while learning Python's syntax and finding out what kinds of libraries are available.

Reference book: [Ready to use! Can be practiced in business! How to make AI / machine learning / deep learning apps with Python](https://www.amazon.co.jp/%E3%81%99%E3%81%90%E3%81%AB%E4%BD%BF%E3%81%88%E3%82%8B-%E6%A5%AD%E5%8B%99%E3%81%A7%E5%AE%9F%E8%B7%B5%E3%81%A7%E3%81%8D%E3%82%8B-Python%E3%81%AB%E3%82%88%E3%82%8B-AI%E3%83%BB%E6%A9%9F%E6%A2%B0%E5%AD%A6%E7%BF%92%E3%83%BB%E6%B7%B1%E5%B1%A4%E5%AD%A6%E7%BF%92%E3%82%A2%E3%83%97%E3%83%AA%E3%81%AE%E3%81%A4%E3%81%8F%E3%82%8A%E6%96%B9-%E3%82%AF%E3%82%B8%E3%83%A9%E9%A3%9B%E8%A1%8C%E6%9C%BA/dp/4802611641)

Procedure & preparation

**Execution environment**: Google Colaboratory

- Commonly used Python libraries are available without installing them with pip.
- It is free, and if you have a Google account you can run code right away without any installation work.

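**Libraries used**

- urllib (downloading the data)
- pandas (loading and aggregating the data)
- matplotlib (drawing the graph)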

Implementation

First of all, prepare the data necessary for the analysis.

Data preparation

To analyze the weather data, download the dataset from the URL below. You could download it by hand, but here let's fetch it with a standard library module called urllib.

# Import the urlretrieve function from urllib
from urllib.request import urlretrieve
# Name of the local file to save to: temperature.csv
filename = "temperature.csv"
# URL of the dataset
url = "https://raw.githubusercontent.com/kujirahand/mlearn-sample/master/tenki2006-2016/kion10y.csv"
# Download the data at url and save it under the name stored in filename
urlretrieve(url, filename)
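Before loading a downloaded file with pandas, it can help to peek at its first few lines to confirm the download worked and to see what the header looks like. The following is a minimal sketch using only the standard library. The helper name `peek_rows` and the tiny sample file are made up for illustration, so the sketch runs without network access.

```python
import csv
import os
import tempfile
from itertools import islice


def peek_rows(path, n=3, encoding="utf-8"):
    """Return the first n rows of a CSV file as lists of strings."""
    with open(path, newline="", encoding=encoding) as f:
        return list(islice(csv.reader(f), n))


# Write a tiny invented sample file (hypothetical header and values).
sample_path = os.path.join(tempfile.mkdtemp(), "sample.csv")
with open(sample_path, "w", newline="", encoding="utf-8") as f:
    f.write("Year,Month,Day,temperature\n2006,1,1,3.6\n2006,1,2,4.0\n")

# Read back only the header and the first data row.
rows = peek_rows(sample_path, n=2)
print(rows)
```

In practice you would pass the name of the file saved by `urlretrieve` instead of `sample_path`.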

Next, let's display the downloaded data.

Pandas is a library for efficient data analysis in Python. Pandas makes it easy for you to do data analysis tasks such as loading data, displaying statistics, and graphing.

# pandas is conventionally imported under the alias pd, so use pd here
import pandas as pd
# Load the CSV file downloaded earlier with read_csv() and keep it in df for the later steps
df = pd.read_csv(filename)
# Display the contents
df

Running this shows that the data has 4018 rows and 6 columns.

[Screenshot: the loaded data displayed as a table]
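The row and column counts can also be checked in code rather than read off the displayed table. Here is a small sketch, assuming pandas is available (as it is on Google Colaboratory). The sample data and its column names are invented for illustration and are much smaller than the real file.

```python
import io

import pandas as pd

# Invented sample in the same spirit as the weather file.
sample_csv = io.StringIO(
    "Year,Month,Day,temperature\n"
    "2006,1,1,3.6\n"
    "2006,1,2,4.0\n"
    "2006,1,3,3.7\n"
)
df_sample = pd.read_csv(sample_csv)

# shape is a (rows, columns) tuple.
n_rows, n_cols = df_sample.shape
print(n_rows, n_cols)

# head() shows the first rows; dtypes shows the type pandas inferred per column.
print(df_sample.head(2))
print(df_sample.dtypes)
```

`df_sample.columns` is also worth printing, because the later steps look columns up by name.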

Check the average temperature


# [Convert the past 10 years of data into a dictionary so it is easier to work with]

history = {}
# iterrows() yields an (index, row) pair for each row, similar to enumerate()
# Note: the column names below must match the header of the CSV file exactly
for i, row in df.iterrows():
  # Extract the month, day, and temperature from the row
  month, day, temperature = (int(row['Month']), int(row['Day']), float(row['temperature']))
  # Build a key such as "12/25"
  key = str(month) + "/" + str(day)
  # Create an empty list the first time a key appears
  if key not in history:
    history[key] = []
  # Add the temperature to the list for that date
  history[key].append(temperature)

# [Find the average value]

average = {}
# Loop over history and get each key
for key in history:
  # Compute the average for the key and store it in average
  average[key] = sum(history[key]) / len(history[key])
  result = average[key]
  # print("{0}: {1}".format(key, result))
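The same "group by date, then average" idea can be written a little more compactly with the standard library. This is only a sketch of an alternative, not the approach used above: `collections.defaultdict` removes the need to check whether a key already exists, and `statistics.mean` replaces the manual sum and division. The records are invented values.

```python
from collections import defaultdict
from statistics import mean

# Invented (month, day, temperature) records standing in for the CSV rows.
records = [
    (12, 25, 6.0),
    (12, 25, 8.0),
    (12, 26, 5.0),
]

# A defaultdict(list) creates an empty list automatically for a new key.
history_sample = defaultdict(list)
for month, day, temp in records:
    history_sample[f"{month}/{day}"].append(temp)

# Average the list of temperatures collected for each date key.
average_sample = {key: mean(values) for key, values in history_sample.items()}
print(average_sample)
```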

Let's check the average temperature of one day

# Function to check the type (accept only strings)
def isString(date):
  return isinstance(date, str)

# Get the average value for the specified date from the average dictionary
def getTemperature(date):
  if isString(date):
    return average[date]
  # Reject anything that is not a string instead of silently returning None
  raise TypeError("date must be a string such as '12/25'")

temperature = getTemperature("12/25")
value = round(temperature)
# Convert the int to a string and print it
print(str(value) + " degrees")
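A lookup like this can fail in two ways: the argument is not a string, or the date has no entry in the dictionary. The sketch below makes both cases explicit. The function name `get_average_temperature` and the sample values are hypothetical, and passing the dictionary in as an argument is a design choice made here so the function does not rely on a global variable.

```python
def get_average_temperature(averages, date):
    """Look up the average for a "month/day" key such as "12/25"."""
    if not isinstance(date, str):
        raise TypeError("date must be a string like '12/25'")
    if date not in averages:
        raise KeyError(f"no data for {date!r}")
    return averages[date]


# Invented averages for illustration only.
averages_sample = {"12/25": 7.4, "12/26": 6.9}

value = round(get_average_temperature(averages_sample, "12/25"))
print(f"{value} degrees")
```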

Try drawing a graph

# Import matplotlib to draw the graph
import matplotlib.pyplot as plt

# Group the temperature data by month
temperature_per_month = df.groupby(['Month'])['temperature']
# Sum each month's temperatures and divide by the number of data points in that month
average_temperature = temperature_per_month.sum() / temperature_per_month.count()
# Draw the graph
average_temperature.plot()
plt.show()

The graph was drawn successfully.

[Screenshot: line graph of the average temperature by month]
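Dividing the monthly sum by the monthly count is the definition of the mean, so the same result should come from the built-in `mean()` of a pandas groupby. A minimal sketch with invented data, assuming pandas is installed:

```python
import pandas as pd

# Invented sample: two readings in month 1 and three in month 2.
df_sample = pd.DataFrame({
    "Month": [1, 1, 2, 2, 2],
    "temperature": [4.0, 6.0, 5.0, 7.0, 9.0],
})

grouped = df_sample.groupby("Month")["temperature"]

# Average computed by hand: sum divided by count, per month.
by_hand = grouped.sum() / grouped.count()

# The same average using the built-in aggregation.
built_in = grouped.mean()

print(built_in)
```

Both forms ignore missing values in the same way, so `mean()` is usually the shorter choice.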

Impressions

- Google Colaboratory is a good starting point for beginners because it saves the trouble of installation and other setup.
- I was able to manipulate and plot the data more easily than I expected, and I learned how powerful Python's libraries are.

Since this was my first time using Python, there is still a lot I do not understand, but I will keep learning so that I can gradually take on more advanced analysis.
