[Python] I compiled data on people infected with the new coronavirus in Ichikawa City, Chiba Prefecture

Introduction

I have compiled data on people infected with the new coronavirus in Ichikawa City, Chiba Prefecture, where I live.

To begin with, the Ichikawa City website does not publish this information in a format that can be reused as open data. The dataset is small and has few columns, so it is not enough for serious analysis, but it seems usable for small things, so I put it into an easy-to-use form. I have also posted sample code (Python).

I update the data from time to time, but updates may be delayed for personal reasons.

[2020/05/08] Added the date of death

URL https://github.com/mine820/COVID-19

Data

The data is in CSV format, encoded in UTF-8.

Columns

The meanings of the columns are as follows.

- Classification: patient (symptoms have developed) or asymptomatic pathogen carrier (symptoms have not yet developed)
- City: the order in which the infection was found among residents of Ichikawa City
- Prefecture: the order in which the infection was found among residents of Chiba Prefecture
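As a rough illustration of how these columns can be used, here is a minimal sketch that loads a CSV and counts cases per classification. It assumes the headers are exactly "Classification", "City", and "Prefecture" as listed above. The rows are made up for the example and are not records from the real file.

```python
import io

import pandas as pd

# Hypothetical rows for illustration only; not taken from the real data.
sample_csv = io.StringIO(
    "Classification,City,Prefecture\n"
    "Patient,1,12\n"
    "Asymptomatic pathogen carrier,2,15\n"
    "Patient,3,20\n"
)

# With the real file, pass its path instead and state the encoding explicitly,
# e.g. pd.read_csv("corona.csv", encoding="utf-8").
df_sample = pd.read_csv(sample_csv)

# Number of rows in each classification
counts = df_sample["Classification"].value_counts()
print(counts)
```

Passing encoding="utf-8" is optional because it is the pandas default, but writing it out makes the expectation visible.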

Sample

Sample code for analyzing the data. The file is a Jupyter Notebook.

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline

df = pd.read_csv('corona.csv')

# Parse the date columns. "unknown" and "investigating" are blanked first
# so that pd.to_datetime turns them into NaT.
for col in ["Date of onset", "Inspection confirmation date", "Date of death"]:
    df[col] = df[col].replace(["unknown", "investigating"], "")
    df[col] = pd.to_datetime(df[col], format="%Y-%m-%d")

# Summary statistics of the "Year" column (plotted as age below)
df["Year"].describe()

# Histogram (age)
plt.title("Age")
plt.yticks([0, 5, 10, 15, 20])
plt.hist(df["Year"], range=(0, 100));

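# Inspection confirmation date + moving average (7 days)
dates = df["Inspection confirmation date"].dropna()  # skip rows without a date
days = (dates.max() - dates.min()).days
hist = plt.hist(dates, bins=days)  # roughly one bin per day; hist[0] holds the counts

left = np.arange(days)  # x positions: days since the first confirmation date

num = 7  # moving-average window in days
b = np.ones(num) / num
# mode='same' keeps the output as long as the input; values near both ends
# are averaged against implicit zeros
y2 = np.convolve(hist[0], b, mode='same')

plt.figure()  # new figure so the bars are not drawn over the histogram above
plt.title("Inspection confirmation date")
plt.bar(left, hist[0], color='green')
plt.plot(y2, color='red')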
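The moving average above uses np.convolve with mode='same'. That mode pads the series with zeros at both ends, so the first and last three points of a 7-day average come out lower than a true 7-day mean. The following sketch shows the difference on made-up daily counts (not real data), comparing mode='same' with mode='valid', which keeps only windows that lie fully inside the data.

```python
import numpy as np

# Made-up daily counts for illustration only (not real data)
daily = np.array([1, 0, 2, 3, 1, 4, 2, 5, 3, 2], dtype=float)

window = 7
kernel = np.ones(window) / window

# Same length as the input, but the edges are averaged against implicit zeros
same = np.convolve(daily, kernel, mode="same")

# Only full 7-day windows: len(daily) - window + 1 points
valid = np.convolve(daily, kernel, mode="valid")

print(len(same), len(valid))
print(same[0], valid[0])
```

In the middle of the series the two results agree; they differ only near the ends. If the drop at the right edge of the red line is misleading, plotting the 'valid' result against the matching days is one possible alternative.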

