[PYTHON] Scraping IDWR bulletin data: influenza reports per sentinel site, by prefecture

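The National Institute of Infectious Diseases (NIID) publishes the same data as a CSV file, so this script scrapes the link to that file from the NIID data page and loads it with pandas.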

from urllib.parse import urljoin

import requests
from bs4 import BeautifulSoup

url = "https://www.niid.go.jp/niid/ja/data.html"

headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; WOW64; Trident/7.0; rv:11.0) like Gecko"
}

r = requests.get(url, headers=headers, timeout=30)
r.raise_for_status()

soup = BeautifulSoup(r.content, "html.parser")

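# Find the link whose href ends with "-teiten.csv"
tag = soup.select_one(
    'div.leading-0 > table > tbody > tr > td > p.body1 > a[href$="-teiten.csv"]'
)

# Fail with a clear message if the page layout changed and nothing matched
if tag is None:
    raise RuntimeError("CSV link not found on the page")

# Resolve the (possibly relative) href against the page URL
link = urljoin(url, tag.get("href"))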
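
The link lookup can be tried offline before hitting the real site. The following is a minimal sketch, assuming a made-up HTML fragment (`SAMPLE_HTML`) and a hypothetical helper `find_csv_link`; the real NIID markup and file names may differ. It uses only the attribute-suffix selector `a[href$=...]`, which is less sensitive to layout changes than the full `div > table > ...` path.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup

# Hypothetical page fragment for illustration; not the real NIID markup.
SAMPLE_HTML = """
<div class="leading-0">
  <table><tbody><tr><td>
    <p class="body1"><a href="/niid/images/idwr/sample-other.pdf">PDF</a></p>
    <p class="body1"><a href="/niid/images/idwr/sample-teiten.csv">CSV</a></p>
  </td></tr></tbody></table>
</div>
"""


def find_csv_link(html, base_url, suffix="-teiten.csv"):
    """Return the absolute URL of the first link ending with `suffix`, or None."""
    soup = BeautifulSoup(html, "html.parser")
    # [href$="..."] matches anchors whose href attribute ends with the suffix
    tag = soup.select_one(f'a[href$="{suffix}"]')
    if tag is None:
        return None
    # Resolve a relative href against the URL of the page it came from
    return urljoin(base_url, tag["href"])


link = find_csv_link(SAMPLE_HTML, "https://www.niid.go.jp/niid/ja/data.html")
print(link)
```

Returning `None` when nothing matches lets the caller decide whether a missing link is an error or just means the page has not been updated yet.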

import pandas as pd

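df = pd.read_csv(
    link,
    encoding="cp932",  # the CSV is encoded in Windows code page 932 (Shift_JIS)
    skiprows=3,  # skip the three lines above the header row
    index_col=0,  # use the first column as the index
    header=0,
    usecols=[0, 1, 2],  # keep only the first three columns
    na_values="-",  # treat "-" as a missing value
)

# Drop rows whose index is missing (rows with an empty first column)
df1 = df[df.index.notna()]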
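
To see what these `read_csv` options do without downloading anything, here is a small sketch on an in-memory CSV. The layout of `SAMPLE_CSV` (three preamble lines, a header row, a "-" placeholder, a trailing row with an empty first column) and its column names are assumptions made for illustration, not the actual NIID file.

```python
import io

import pandas as pd

# Hypothetical CSV text; the real file's layout and column names may differ.
SAMPLE_CSV = (
    "title line\n"
    "week line\n"
    "units line\n"
    "prefecture,reports,per_site,extra\n"
    "Hokkaido,120,0.54,x\n"
    "Aomori,-,-,x\n"
    ",,,footer\n"
)

# encoding is omitted because the sample is already a Python string
df = pd.read_csv(
    io.StringIO(SAMPLE_CSV),
    skiprows=3,  # drop the three preamble lines
    index_col=0,  # first column becomes the index
    header=0,
    usecols=[0, 1, 2],  # ignore the fourth column
    na_values="-",  # "-" is read as NaN
)

# Keep only rows that have a value in the index column
df1 = df[df.index.notna()]
print(df1)
```

In this sample the footer row has no value in the first column, so it ends up with a NaN index and is removed by the `notna()` filter, while the "-" cells for Aomori become NaN but the row itself is kept.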
