[Python] Scraping the usage history of the community cycle

Introduction

I often use the Chuo-ku (Tokyo) Community Cycle. A wide-area bicycle-sharing experiment is currently underway across four wards (Chuo, Chiyoda, Minato, and Koto), and the service is very convenient: you can ride freely between the wards and return the bicycle to a cycle port different from the one where you rented it.

What I tried

On the web you can check when and where each ride started and ended, but I wanted to automate this, so I wrote a script with Python 3 + BeautifulSoup 4 + Selenium + Firefox. It has been confirmed to work on Windows. (Since I am registered in Chuo Ward, the script is written for Chuo Ward members.)

docomo-cycle.py


#!/usr/bin/env python3
# -*- coding: utf-8 -*-

import csv
import time

from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

MEMBERID = "(My user ID)"
PASSWORD = "(My password)"

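# Launch Firefox and open the Chuo Ward (AreaID=2) login page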
driver = webdriver.Firefox()
driver.get("https://tcc.docomo-cycle.jp/cycle/TYO/cs_web_main.php?AreaID=2")

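# Fill in the member ID and password fields and submit the login form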
mid = driver.find_element(By.NAME, 'MemberID')
mid.send_keys(MEMBERID)
password = driver.find_element(By.NAME, 'Password')
password.send_keys(PASSWORD)
password.send_keys(Keys.RETURN)

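# Wait up to 5 seconds for the "Billing" link on the post-login menu,
# then open it and give the history page a moment to render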
obj1 = WebDriverWait(driver,5).until(
    EC.presence_of_element_located((By.PARTIAL_LINK_TEXT, "Billing")))
obj1.click()

time.sleep(3)

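# Parse the rendered page and pull out the usage-history table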
soup = BeautifulSoup(driver.page_source, "html.parser")

table = soup.find_all("table", {"class": "rnt_ref_table"})[0]
rows = table.find_all("tr")

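# Write each table row (header and data cells) out as one CSV row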
with open("docomo-cycle.csv", 'w', newline='', encoding='utf-8') as csvFile:
  writer = csv.writer(csvFile)
  for row in rows:
    csvRow = []
    for cell in row.find_all(['td', 'th']):
      csvRow.append(cell.get_text().replace('\n', ''))
    writer.writerow(csvRow)

driver.quit()

When the script above is run, the usage history is written to docomo-cycle.csv. It looks like this:

docomo-cycle.csv


1,2016/5/2 07:22,B3-01.Chuo Ward Office B3-01.Chuo City Office,→,2016/5/2 07:35,A3-02.Casa Nova Shop (Kaede Building) A3-02.CASA NOUVA SHOP(Kaede Building)
2,2016/5/2 18:29,A3-02.Casa Nova Shop (Kaede Building) A3-02.CASA NOUVA SHOP(Kaede Building),→,2016/5/2 18:50,B4-03.Sakura no Sanpomichi (in front of Harumi Triton Square) B4-03.Sakura no sanpomichi(In front of Harumi Triton Square)
3,2016/5/5 21:32,B3-03.Ginza 6-chome-SQUARE (Kobikicho-dori) B3-03.Ginza 6-chome SQUARE(Kobikicho Dori),→,2016/5/5 21:48,B4-03.Sakura no Sanpomichi (in front of Harumi Triton Square) B4-03.Sakura no sanpomichi(In front of Harumi Triton Square)
4,2016/5/6 07:28,B4-03.Sakura no Sanpomichi (in front of Harumi Triton Square) B4-03.Sakura no sanpomichi(In front of Harumi Triton Square),→,2016/5/6 07:41,B2-02.Yanagi-dori (in front of Tokyo Square Garden) B2-02.Yanagi-dori St. (In front of TOKYO SQUARE GARDEN)
5,2016/5/8 05:00,B4-03.Sakura no Sanpomichi (in front of Harumi Triton Square) B4-03.Sakura no sanpomichi(In front of Harumi Triton Square),→,2016/5/8 05:08,H1-02.Toyosu Station H1-02.Toyosu Station
6,2016/5/9 07:25,B4-03.Sakura no Sanpomichi (in front of Harumi Triton Square) B4-03.Sakura no sanpomichi(In front of Harumi Triton Square),→,2016/5/9 07:48,A3-02.Casa Nova Shop (Kaede Building) A3-02.CASA NOUVA SHOP(Kaede Building)
7,2016/5/10 08:18,B4-03.Sakura no Sanpomichi (in front of Harumi Triton Square) B4-03.Sakura no sanpomichi(In front of Harumi Triton Square),→,2016/5/10 08:40,A3-02.Casa Nova Shop (Kaede Building) A3-02.CASA NOUVA SHOP(Kaede Building)

Future

From an automation standpoint it is wasteful to have Firefox start up and take over the screen, so I would like to switch to PhantomJS or something similar so the script runs silently. After that: running the scrape on a schedule with cron, or automatically appending the results to a Google spreadsheet...
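As a rough sketch of that direction, only the driver construction needs to change; the rest of the script stays the same. This assumes PhantomJS is installed and on the PATH (the Selenium of that era shipped a webdriver.PhantomJS class; it has since been removed in Selenium 4, where headless Firefox is the replacement):

# Option 1: PhantomJS, for the Selenium versions of the time
# (webdriver.PhantomJS was removed in Selenium 4)
driver = webdriver.PhantomJS()

# Option 2: headless Firefox with current Selenium - no window appears
from selenium.webdriver.firefox.options import Options

options = Options()
options.add_argument("--headless")
driver = webdriver.Firefox(options=options)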

Reference

"Web scraping with Python" O'Reilly Japan, ISBN978-4-87311-761-4 -For CSV file conversion, I referred to this.

Aside

Community cycle services outside Chuo-ku seem to run on the same system, so you may be able to do the same thing by replacing the URL in the **driver.get** line with one of the following (a small sketch follows the list).

・Koto Ward: https://tcc.docomo-cycle.jp/cycle/TYO/cs_web_main.php?AreaID=4
・Chiyoda Ward: https://tcc.docomo-cycle.jp/cycle/TYO/cs_web_main.php?AreaID=1
・Minato Ward: https://tcc.docomo-cycle.jp/cycle/TYO/cs_web_main.php?AreaID=3
・Yokohama: https://tcc.docomo-cycle.jp/cycle/YKH/cs_web_main.php
・Sendai: https://tcc.docomo-cycle.jp/cycle/SND/cs_web_main.php
・Hiroshima: https://tcc.docomo-cycle.jp/cycle/HRS/cs_web_main.php
・Western Kanagawa: https://tcc.docomo-cycle.jp/cycle/KNS/cs_web_main.php
・Koshu: https://tcc.docomo-cycle.jp/cycle/KSH/cs_web_main.php
・Kobe: https://tcc.docomo-cycle.jp/cycle/kob/cs_web_main.php
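To make switching areas a one-line change, the URLs can also go in a dictionary. A small sketch using the URLs above (the AREA_URLS name and its keys are my own labels, and whether the rest of the scraping works unchanged for every area is untested):

# Hypothetical lookup table of login URLs, keyed by my own area labels
AREA_URLS = {
    "chuo": "https://tcc.docomo-cycle.jp/cycle/TYO/cs_web_main.php?AreaID=2",
    "chiyoda": "https://tcc.docomo-cycle.jp/cycle/TYO/cs_web_main.php?AreaID=1",
    "minato": "https://tcc.docomo-cycle.jp/cycle/TYO/cs_web_main.php?AreaID=3",
    "koto": "https://tcc.docomo-cycle.jp/cycle/TYO/cs_web_main.php?AreaID=4",
    "yokohama": "https://tcc.docomo-cycle.jp/cycle/YKH/cs_web_main.php",
    "sendai": "https://tcc.docomo-cycle.jp/cycle/SND/cs_web_main.php",
    "hiroshima": "https://tcc.docomo-cycle.jp/cycle/HRS/cs_web_main.php",
    "kanagawa_west": "https://tcc.docomo-cycle.jp/cycle/KNS/cs_web_main.php",
    "koshu": "https://tcc.docomo-cycle.jp/cycle/KSH/cs_web_main.php",
    "kobe": "https://tcc.docomo-cycle.jp/cycle/kob/cs_web_main.php",
}

# Replaces the hard-coded Chuo Ward URL in the script above
driver.get(AREA_URLS["koto"])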
