[PYTHON] Visualize the response status of the 2020 census

Introduction

kokusei2020.png

This article scrapes the Excel files of response status published for the 2020 census and visualizes the Internet response rate and the mail response rate by prefecture.

Scraping

import requests
from bs4 import BeautifulSoup

import re

from urllib.parse import urljoin

url = "https://www.kokusei2020.go.jp/internet/"

r = requests.get(url)
r.raise_for_status()

soup = BeautifulSoup(r.content, "html.parser")

links = {}

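# Each download link wraps a <span> whose text is "Excel"
# (string= replaces the deprecated text= argument)
for i in soup.find_all("span", string="Excel"):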

    link = urljoin(url, i.find_parent("a").get("href"))

    # Raw string avoids the invalid "\d" escape; the dot before xlsx is escaped
    m = re.search(r"census_answers_(pref|city)_\d{6}\.xlsx", link)

    if m:
        links[m.group(1)] = link

links
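
The link-matching step can be tried without touching the network. The following is a minimal sketch, assuming the file names follow the `census_answers_(pref|city)_<six digits>.xlsx` pattern used above; the helper `classify_links` and the sample URLs are hypothetical and exist only for illustration.

```python
import re

# Same pattern as the scraping step, as a raw string with the dot escaped.
PATTERN = re.compile(r"census_answers_(pref|city)_\d{6}\.xlsx")


def classify_links(hrefs):
    """Map "pref" / "city" to the matching URL; ignore everything else."""
    found = {}
    for href in hrefs:
        m = PATTERN.search(href)
        if m:
            found[m.group(1)] = href
    return found


# Hypothetical URLs for illustration only; the real file names may differ.
sample = [
    "https://example.com/files/census_answers_pref_201001.xlsx",
    "https://example.com/files/census_answers_city_201001.xlsx",
    "https://example.com/files/leaflet.pdf",
]
links_demo = classify_links(sample)
print(links_demo)
```

If the page publishes several files of the same kind, the last match wins, because each key is simply overwritten.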

By prefecture

import pandas as pd

df_pref = pd.read_excel(
    links["pref"],
    index_col=[0, 1],
    header=None,
    skiprows=9,
    usecols=[1, 2, 3, 4, 5, 6, 7],
    # The rate columns must be named "Net rate" / "Mailing rate" to match the code below
    names=["code", "Prefectures", "Number of H27 households", "Net", "By mail", "Net rate", "Mailing rate"],
)

df_pref["Number of responses"] = df_pref["Net"] + df_pref["By mail"]

df_pref["Net rate"] *= 100
df_pref["Mailing rate"] *= 100

df_pref["Response rate"] = df_pref["Net rate"] + df_pref["Mailing rate"]

df_pref.to_csv("pref.csv", encoding="utf_8_sig")
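
The derived columns can be checked on a small table before trusting the real file. This is only a sketch with made-up numbers; it assumes that each rate is the response count divided by the H27 household count, which the source does not state explicitly.

```python
import pandas as pd

# Made-up numbers for illustration; these are not census results.
toy = pd.DataFrame(
    {
        "Prefectures": ["A", "B"],
        "Number of H27 households": [1000, 2000],
        "Net": [300, 500],
        "By mail": [200, 700],
    }
).set_index("Prefectures")

# Assumption: rate = responses / H27 households, expressed in percent.
toy["Net rate"] = toy["Net"] / toy["Number of H27 households"] * 100
toy["Mailing rate"] = toy["By mail"] / toy["Number of H27 households"] * 100

# Same derived columns as in the prefecture step above.
toy["Number of responses"] = toy["Net"] + toy["By mail"]
toy["Response rate"] = toy["Net rate"] + toy["Mailing rate"]

print(toy[["Net rate", "Mailing rate", "Response rate"]])
```

If the assumption holds, the total response rate should equal the number of responses divided by the household count, which gives a quick consistency check against the downloaded sheet.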

By municipality

df_city = pd.read_excel(
    links["city"],
    index_col=[0, 1, 2],
    header=None,
    skiprows=9,
    usecols=[1, 2, 3, 4, 5, 6, 7, 8],
    # The rate columns must be named "Net rate" / "Mailing rate" to match the code below
    names=["code", "Prefectures", "Municipality", "Number of H27 households", "Net", "By mail", "Net rate", "Mailing rate"],
)

df_city["Number of responses"] = df_city["Net"] + df_city["By mail"]

df_city["Net rate"] *= 100
df_city["Mailing rate"] *= 100

df_city["Response rate"] = df_city["Net rate"] + df_city["Mailing rate"]

df_city.to_csv("city.csv", encoding="utf_8_sig")

df_city
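
With the three-level index in place, one possible follow-up is to pick the municipality with the highest response rate in each prefecture. The sketch below uses made-up rows with the same index names; the codes, names, and rates are hypothetical, and the real sheet may contain extra rows (for example totals) that would need filtering first.

```python
import pandas as pd

# Made-up rows for illustration; codes, names and rates are not census data.
toy_city = pd.DataFrame(
    {
        "code": [1, 2, 3, 4],
        "Prefectures": ["P1", "P1", "P2", "P2"],
        "Municipality": ["a", "b", "c", "d"],
        "Response rate": [55.0, 62.5, 48.0, 70.0],
    }
).set_index(["code", "Prefectures", "Municipality"])

# Sort by rate, then keep the first (highest) row of each prefecture.
best = (
    toy_city.sort_values("Response rate", ascending=False)
    .groupby(level="Prefectures")
    .head(1)
)
print(best)
```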

Visualization

import matplotlib.pyplot as plt
import seaborn as sns

sns.set()

import japanize_matplotlib

# Raise the figure resolution
import matplotlib as mpl

mpl.rcParams["figure.dpi"] = 200

df1 = df_pref.sort_index(ascending=False).reset_index(level="code", drop=True)

df1.loc[:, ["Net rate", "Mailing rate"]].plot.barh(stacked=True, figsize=(5, 10))

plt.legend(bbox_to_anchor=(1.05, 1), loc="upper left", borderaxespad=0, fontsize=8)

# File name without the stray trailing space
plt.savefig("01.png", dpi=200, bbox_inches="tight")
plt.show()
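
The chart above keeps the prefecture code order. As a variation, the rows could be ordered by total response rate instead. This is a sketch on made-up rates, not the author's method; the `Agg` backend is chosen only so that the example runs without a display.

```python
import matplotlib

matplotlib.use("Agg")  # render off-screen so no display is needed
import matplotlib.pyplot as plt
import pandas as pd

# Made-up rates for illustration; these are not census results.
toy = pd.DataFrame(
    {"Net rate": [30.0, 25.0, 40.0], "Mailing rate": [20.0, 35.0, 15.0]},
    index=["A", "B", "C"],
)

# Order rows by total response rate. barh draws the first row at the bottom,
# so ascending order puts the highest total at the top of the chart.
order = (toy["Net rate"] + toy["Mailing rate"]).sort_values().index
ranked = toy.loc[order]

ax = ranked.plot.barh(stacked=True, figsize=(5, 3))
ax.set_xlabel("Response rate (%)")
plt.close("all")
```

The same two lines that build `order` and `ranked` should work on `df1` in place of `toy`, since it has the same two rate columns.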

pref.png
