https://yokonoji.work/python-scraping-6
https://qiita.com/akabei/items/0eac37cb852ad476c6b9
requests BeautifulSoup oauth2client gspread
Creating the service account key for Google Sheets access works exactly as described on the reference sites, so only an outline is given here.
At https://console.developers.google.com/cloud-resource-manager, do the following:
- Create a project
- Enable the Google Drive API
- Enable the Google Sheets API
- Create a service account key (download it as JSON)
- Create a spreadsheet
- From "Share", share the spreadsheet with the "client_email" address written in the downloaded JSON
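The address to enter in the "Share" dialog can be read straight out of the downloaded key file. The following is a minimal sketch, assuming the key is the standard service account JSON with a top-level "client_email" field; the helper name read_client_email is hypothetical and not part of any library.

```python
import json


def read_client_email(key_path):
    """Return the client_email field from a service account JSON key file."""
    with open(key_path, encoding="utf-8") as f:
        key = json.load(f)
    # Raises KeyError if the file is not a service account key.
    return key["client_email"]
```

Calling read_client_email('<Downloaded JSON file name>') should return the address that needs edit access to the spreadsheet.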
sample.py
import requests
import gspread
from bs4 import BeautifulSoup
from oauth2client.service_account import ServiceAccountCredentials
url = "<URL of the site to get>"
r = requests.get(url)
soup = BeautifulSoup(r.text, 'lxml')
elements = soup.select('<Tags you want to get>')  # select() takes a CSS selector
scope = ['https://spreadsheets.google.com/feeds',
'https://www.googleapis.com/auth/drive']
credentials = ServiceAccountCredentials.from_json_keyfile_name('<Downloaded JSON file name>', scope)
gc = gspread.authorize(credentials)
wks = gc.open('<Spreadsheet name>').sheet1
for index, e in enumerate(elements):
    num = index + 1  # Spreadsheet rows start at 1, not 0, so add 1
    wks.update_acell('A' + str(num), e.get_text())
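The loop in sample.py sends one request per cell, which can be slow for long lists. As a possible alternative, the sketch below fetches the whole column range once and writes it back in a single update. It assumes the worksheet object offers gspread's range() and update_cells() methods; the function name write_column and its column parameter are hypothetical and exist only for this illustration.

```python
def write_column(wks, texts, column="A"):
    """Write texts into one column, starting at row 1, in a single update.

    wks is assumed to behave like a gspread Worksheet,
    i.e. to provide range() and update_cells().
    """
    texts = list(texts)
    if not texts:
        return 0  # nothing to write, so skip requesting an invalid range
    # Fetch the Cell objects for a range such as 'A1:A3' in one request.
    cell_list = wks.range(f"{column}1:{column}{len(texts)}")
    for cell, text in zip(cell_list, texts):
        cell.value = text
    # Send all modified cells back in one request.
    wks.update_cells(cell_list)
    return len(texts)
```

With the variables from sample.py, this would be called as write_column(wks, [e.get_text() for e in elements]).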