[PYTHON] Scraping the member stores of Go To EAT in Osaka Prefecture and converting them to CSV

Scraping the member stores of the Go To Eat Osaka campaign

import time

import requests
from bs4 import BeautifulSoup

import pandas as pd

result = []

url = "https://goto-eat.weare.osaka-info.jp/?search_element_0_0=2&search_element_0_1=3&search_element_0_2=4&search_element_0_3=5&search_element_0_4=6&search_element_0_5=7&search_element_0_6=8&search_element_0_7=9&search_element_0_8=10&search_element_0_9=11&search_element_0_cnt=10&search_element_1_0=12&search_element_1_1=13&search_element_1_2=14&search_element_1_3=15&search_element_1_4=16&search_element_1_5=17&search_element_1_6=18&search_element_1_7=19&search_element_1_8=20&search_element_1_9=21&search_element_1_10=22&search_element_1_11=23&search_element_1_12=24&search_element_1_13=25&search_element_1_14=26&search_element_1_15=27&search_element_1_16=28&search_element_1_17=29&search_element_1_cnt=18&searchbutton=%E5%8A%A0%E7%9B%9F%E5%BA%97%E8%88%97%E3%82%92%E6%A4%9C%E7%B4%A2%E3%81%99%E3%82%8B&csp=search_add&feadvns_max_line_0=2&fe_form_no=0"

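# Follow the result pages until there is no "next" link
while True:

    # Fetch the current page; raise an exception on an HTTP error status
    r = requests.get(url)
    r.raise_for_status()

    soup = BeautifulSoup(r.content, "html.parser")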

    for li in soup.select("div.search_result_box > ul > li"):

        data = {}
        data["Store name"] = li.select_one("p.name").get_text(strip=True)
        data["Genre"], data["area"] = li.select_one("ul.tag_list").stripped_strings

        for tr in li.table.select("tr"):

            k = tr.th.get_text(strip=True)

            # The label must match the header text exactly as it appears on the page
            if k == "Street address":
                v = list(tr.td.stripped_strings)

                data["Postal code"] = v[0]
                data[k] = " ".join(v[-1].split())
            else:
                data[k] = tr.td.get_text(strip=True)

        result.append(data)

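    # Look for the link to the next page of results
    tag = soup.select_one("div.wp-pagenavi > a.nextpostslink")

    if tag:
        url = tag.get("href")
    else:
        # No next page: every result has been collected
        break

    # Wait a second between requests to avoid hammering the server
    time.sleep(1)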

# Build the table with a fixed column order; keys missing from a store become NaN
df = pd.DataFrame(result).reindex(
    columns=["Store name", "Genre", "area", "Postal code", "Street address", "TEL", "business hours", "Regular holiday"]
)

df.to_csv("osaka.csv", encoding="utf_8_sig")
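
The `utf_8_sig` encoding passed to `to_csv` prepends a UTF-8 byte order mark (BOM), which spreadsheet applications such as Excel commonly use to recognise UTF-8 and display Japanese text correctly. The following is a minimal standard-library sketch of that effect, not part of the original script; the temporary file and the sample row are made up for illustration.

```python
import csv
import os
import tempfile

# Write a tiny CSV with the same encoding the script uses.
path = os.path.join(tempfile.mkdtemp(), "sample.csv")
with open(path, "w", encoding="utf_8_sig", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["Store name", "area"])
    writer.writerow(["サンプル店", "大阪市"])  # hypothetical row, not real data

# Read the first three bytes: these are the BOM written by utf_8_sig.
with open(path, "rb") as f:
    head = f.read(3)
print(head == b"\xef\xbb\xbf")

# Reading the file back with utf_8_sig strips the BOM again.
with open(path, encoding="utf_8_sig", newline="") as f:
    rows = list(csv.reader(f))
print(rows[0])
```

If the consumer of the CSV is another program rather than a spreadsheet, plain `utf-8` may be the simpler choice, since some parsers treat the BOM as part of the first column name.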

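The address handling inside the loop above can be tried in isolation. This is a minimal sketch, assuming the address cell yields the postal code as its first text fragment and the street address as its last; the helper name `split_address` and the sample fragments are hypothetical and do not come from the site.

```python
def split_address(parts):
    """Split the text fragments of an address cell into (postal_code, address).

    Mirrors the loop above: the first fragment is taken as the postal code,
    the last as the address, with runs of whitespace collapsed to one space.
    """
    parts = list(parts)
    postal_code = parts[0]
    # str.split() with no argument splits on any whitespace,
    # including the full-width space (U+3000) common in Japanese text.
    address = " ".join(parts[-1].split())
    return postal_code, address


# Hypothetical fragments, shaped like what stripped_strings might yield
sample = ["〒530-0001", "大阪市北区梅田\u30001-1-1   ビル2F"]
postal_code, address = split_address(sample)
print(postal_code)
print(address)
```

Keeping this step in a small function makes it easy to check against a few real cells before running the full crawl.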