[PYTHON] I checked how many stores opened and closed nationwide during the coronavirus outbreak

How many stores did the coronavirus shut down?

Information on store openings and closures across the country is collected on a site called Opening and Closing.com (kaiten-heiten.com). As a scraping exercise, let's draw a color map of the balance between store openings and closures in each prefecture for March-April 2020 and for the same months of 2019.

Method

The approach is to count, for each region on the site above, the number of stores that fall within each period. Conveniently, the listings are categorized by region and each page is sorted in descending date order, so the scraper can stop paging once it reaches entries older than the target period. Scraping uses Python's beautifulsoup4.
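The early-stop idea can be sketched without any network access. This is only an illustration under stated assumptions: the function name `count_in_period` and the sample pages are hypothetical, and dates are compared as `(year, month)` tuples, which is a simplification of the `period()` helper used in the implementation below.

```python
from typing import Iterable, List, Tuple

YearMonth = Tuple[int, int]


def count_in_period(pages: Iterable[List[YearMonth]],
                    start: YearMonth, end: YearMonth) -> int:
    """Count (year, month) entries within [start, end], reading pages newest first."""
    count = 0
    for page in pages:
        if not page:
            break
        count += sum(1 for ym in page if start <= ym <= end)
        # The last entry is the oldest one on the page. If it is already
        # older than the period, no later page can contain a match.
        if page[-1] < start:
            break
    return count


# Hypothetical data: three pages of (year, month) entries, newest first.
pages = [
    [(2020, 5), (2020, 5), (2020, 4)],
    [(2020, 4), (2020, 3), (2020, 2)],
    [(2020, 1), (2019, 12), (2019, 4)],  # not needed: the loop stops on page 2
]
n = count_in_period(pages, start=(2020, 3), end=(2020, 4))
```

Because the loop stops as soon as a page ends before the period, the number of requests depends on how far back the period lies, not on the total size of the archive.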

Results / Discussion

Let's look at the results first (the implementation is below). The values on the color bar are, for each prefecture, the normalized ratio = $(N_{opened} - N_{closed}) / (N_{opened} + N_{closed})$, which ranges from -1 (only closures) to +1 (only openings).
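As a quick worked example of this ratio, here is a minimal sketch. The function name `open_close_ratio` and the counts are made up for illustration, and returning NaN for a region with no entries is an assumption of this sketch, not something the article specifies.

```python
def open_close_ratio(n_opened: int, n_closed: int) -> float:
    """Return (N_opened - N_closed) / (N_opened + N_closed)."""
    total = n_opened + n_closed
    if total == 0:
        # Assumption: a region with no entries has no defined ratio.
        return float("nan")
    return (n_opened - n_closed) / total


more_openings = open_close_ratio(30, 10)   # 20 / 40 = 0.5
more_closures = open_close_ratio(10, 30)   # -20 / 40 = -0.5
balanced = open_close_ratio(5, 5)          # 0 / 10 = 0.0
```

A positive value means more openings than closures in the period, a negative value the opposite.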

[Figures: 2020.png and 2019.png, the color maps for March-April of each year]

Looking at these maps, there is no significant difference attributable to the coronavirus. We will need to watch future trends to see whether the impact is simply small or will appear with a time lag. I hope the damage can be kept as small as possible.

Implementation

First, collect the data from the site above.

shop_openup_closedown_ratio_1.py



from bs4 import BeautifulSoup
from urllib import request
import datetime
import numpy as np

def period(year, month, year_s=2019, year_e=2019, month_s=3, month_e=4):
    """Classify a date relative to the target period.

    Returns 2 if the date is inside the period, 1 if it is later than
    the end of the period, and 0 otherwise.
    """
    if year_s <= year <= year_e and month_s <= month <= month_e:
        return 2
    if (year, month) > (year_e, month_e):
        return 1
    return 0



def main(year_s = 2020,year_e = 2020,month_s = 4,month_e = 4):
    dic = {}
    states = ['close','open']
    for state in range(len(states)):
        url = 'https://kaiten-heiten.com/heiten/area-' + states[state] + '/'  
        response = request.urlopen(url)
        soup = BeautifulSoup(response,'html.parser')

        for a in soup.find_all('a', class_="links"):   
            link = a.get('href')
            region = a.text 
            print(region)

            if dic.get(region) is None:
                dic[region] = [0,0]



            # Open the first page of this region's listing.
            response = request.urlopen(link)
            page = BeautifulSoup(response, 'html.parser')

            while True:
                shop_list = page.find_all('span', class_='post_time')
                if not shop_list:
                    break

                # Count the posts on this page that fall inside the period.
                for post in shop_list:
                    year = int(post.text[:5])
                    month = int(post.text[6:8])
                    if period(year, month, year_s, year_e, month_s, month_e) == 2:
                        dic[region][state] += 1

                # Pages are in descending date order, so stop once the oldest
                # post on this page is no longer in or after the period.
                year_last = int(shop_list[-1].text[:5])
                month_last = int(shop_list[-1].text[6:8])
                print(year_last, month_last)
                if period(year_last, month_last, year_s, year_e, month_s, month_e) < 1:
                    break

                # Follow the "next" link; stop when there is none.
                next_p = page.find('a', class_='next page-numbers')
                if next_p is None:
                    break
                response = request.urlopen(next_p.get('href'))
                page = BeautifulSoup(response, 'html.parser')


                            
    regions = list(dic.keys())
    vals_ = np.array(list(dic.values()))

    # The first 22 entries are merged into a single 'Hokkaido' entry.
    hk = 22

    hk_name = ['Hokkaido']
    hk_vals = np.array([vals_[:hk,0].sum(), vals_[:hk,1].sum()])

    regions = hk_name + regions[hk:]
    vals = [list(hk_vals)] + list(vals_[hk:])


    # ratio = (N_o - N_c) / (N_o + N_c)

    ratio = np.zeros(len(regions))

    for i in range(len(regions)):
        ratio[i] = (vals[i][1] - vals[i][0]) / (vals[i][1] + vals[i][0])
    
    return regions, vals, ratio, vals_


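# The fourth return value is the raw per-region array; give it its own name
# so it does not overwrite the aggregated counts.
regions_2020, vals_2020, ratio_2020, raw_vals_2020 = main(2020, 2020, 3, 4)
regions_2019, vals_2019, ratio_2019, raw_vals_2019 = main(2019, 2019, 3, 4)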

The script above collects the data.
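The scraper reads the year and month with fixed slices (`text[:5]` and `text[6:8]`), which depends on the exact character positions in the date text. A more tolerant alternative is a regular expression. The sketch below assumes only that the text contains a four-digit year followed by a one- or two-digit month with some non-digit separator between them; the helper name `parse_year_month` and the sample strings are hypothetical and not taken from the site.

```python
import re
from typing import Optional, Tuple

# Four-digit year, any non-digit separator, then a one- or two-digit month.
_DATE_RE = re.compile(r"(\d{4})\D+(\d{1,2})")


def parse_year_month(text: str) -> Optional[Tuple[int, int]]:
    """Extract (year, month) from a date string, or None if no date is found."""
    m = _DATE_RE.search(text)
    if m is None:
        return None
    return int(m.group(1)), int(m.group(2))


# Hypothetical inputs with different separators and leading whitespace.
first = parse_year_month("\n2020/04/30")
second = parse_year_month("2019年3月15日")
missing = parse_year_month("no date here")
```

Returning `None` instead of raising lets the caller skip malformed entries rather than abort the whole crawl.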

shop_openup_closedown_ratio_2.py



import numpy as np
import matplotlib.colors
import matplotlib.pyplot as plt
from japanmap import picture  # only picture() is used below



def mapping(regions,ratio,name,a=0.1,b=1):

    n_min = a
    n_max = b

    cmap = plt.cm.rainbow
    norm = matplotlib.colors.Normalize(vmin=n_min, vmax=n_max)

    def color_scale(r):
        tmp = cmap(norm(r))
        return (tmp[0]*255, tmp[1]*255, tmp[2]*255)

    dic = {}
    for k in range(len(regions)):
        map_val = color_scale(ratio[k])
        dic[regions[k]] = map_val

    lab = name + ' 3~4'
    fig = plt.figure(figsize=(15,9))
    plt.title(lab,fontsize=15)
    plt.imshow(picture(dic))


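    sm = plt.cm.ScalarMappable(cmap=cmap, norm=norm)
    sm.set_array([])

    # Pass the axes explicitly, since the mappable is not attached to any axes.
    plt.colorbar(sm, ax=plt.gca())

    # Save before show(), which may clear the figure when run as a script.
    fig.savefig(name)
    plt.show()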

The function above maps each ratio to a color and draws the map. The plotting calls are below.

shop_openup_closedown_ratio_3.py



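# Use one color scale for both years so the two maps are directly comparable.
a = list(ratio_2020) + list(ratio_2019)

max_n = max(a)
min_n = min(a)
mapping(regions_2019, ratio_2019, '2019', min_n, max_n)
mapping(regions_2020, ratio_2020, '2020', min_n, max_n)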
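The reason for taking the minimum and maximum over both years is that the same ratio should get the same color on both maps. A minimal pure-Python sketch of that idea follows; the helper names and the sample ratios are hypothetical, and the linear rescaling shown is the same kind of mapping that `matplotlib.colors.Normalize` applies by default.

```python
from typing import Sequence, Tuple


def shared_limits(*series: Sequence[float]) -> Tuple[float, float]:
    """Return the (min, max) over all values in all given series."""
    values = [v for s in series for v in s]
    return min(values), max(values)


def normalize(value: float, vmin: float, vmax: float) -> float:
    """Linearly rescale value so that vmin maps to 0.0 and vmax to 1.0."""
    if vmax == vmin:
        return 0.0
    return (value - vmin) / (vmax - vmin)


# Hypothetical ratios for two years.
ratio_a = [-0.2, 0.1, 0.4]
ratio_b = [-0.5, 0.0, 0.3]

vmin, vmax = shared_limits(ratio_a, ratio_b)
# With shared limits, equal ratios map to equal positions on the color scale.
pos_a = [normalize(r, vmin, vmax) for r in ratio_a]
pos_b = [normalize(r, vmin, vmax) for r in ratio_b]
```

If each year were normalized with its own limits instead, the same color could stand for different ratios on the two maps.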


Data source

Opening and Closing.com (https://kaiten-heiten.com/)
