[PYTHON] Scraping with Selenium

Scraping dynamically generated sites with Selenium

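If a site builds its content with JavaScript, you may not be able to scrape it with Beautiful Soup alone, because the HTML you download does not yet contain the rendered content. Selenium, which drives a real browser, can be used in such cases.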
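As a rough illustration of the problem, the sketch below parses the raw HTML of a hypothetical page whose visible text is written only by JavaScript. It uses the standard library's html.parser as a stand-in for Beautiful Soup so that it runs without extra packages. The page content and the DivTextCollector class are made up for this example.

```python
from html.parser import HTMLParser

# Hypothetical page: the visible text is only written by JavaScript at run time.
RAW_HTML = """
<html><body>
<div id="app"></div>
<script>document.getElementById("app").textContent = "Hello";</script>
</body></html>
"""


class DivTextCollector(HTMLParser):
    """Collect the text found inside <div id="app"> in the raw HTML."""

    def __init__(self):
        super().__init__()
        self.inside_app = False
        self.text = ""

    def handle_starttag(self, tag, attrs):
        if tag == "div" and dict(attrs).get("id") == "app":
            self.inside_app = True

    def handle_endtag(self, tag):
        if tag == "div":
            self.inside_app = False

    def handle_data(self, data):
        # Only keep text that appears while we are inside the target div.
        if self.inside_app:
            self.text += data


collector = DivTextCollector()
collector.feed(RAW_HTML)
static_text = collector.text.strip()
print(repr(static_text))  # the div is empty because no JavaScript has run
```

A static parser never executes the script, so the div stays empty. A browser driven by Selenium would run the script before the HTML is read.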

Get ChromeDriver

First, check your version of Chrome.

(For Mac)

  1. With Chrome open, click "Chrome" in the menu bar at the top left of the screen
  2. Click "About Google Chrome"
  3. A page called "Settings - About Chrome" opens and shows the version. Check the part that says "Version: 8?.~~~~".

Get ChromeDriver from the download page (https://chromedriver.chromium.org/downloads).

From the list of releases on that page, download the ChromeDriver that matches the Chrome version you checked above. (Select your OS on the page that the link leads to.)
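If you want to double-check the match, a small helper like the one below can compare two version strings. This is only a sketch. It assumes that the first three components (major, minor, build) of the Chrome and ChromeDriver versions should agree. The function names and the version strings are hypothetical examples.

```python
def version_prefix(version: str) -> list:
    """Return the first three dotted components (major, minor, build)."""
    return version.strip().split(".")[:3]


def driver_matches_browser(chrome_version: str, driver_version: str) -> bool:
    """True when Chrome and ChromeDriver share major, minor and build numbers."""
    return version_prefix(chrome_version) == version_prefix(driver_version)


# Hypothetical version strings, for illustration only.
same_build = driver_matches_browser("85.0.4183.121", "85.0.4183.87")
different_build = driver_matches_browser("85.0.4183.121", "86.0.4240.22")
print(same_build, different_build)
```

The last component (the patch number) is deliberately ignored here, since the browser and the driver do not necessarily share it.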

How to use

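from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.chrome.service import Service
import time

url = "~~~~~~"  # URL you want to open
driver_path = "~~~~~~"  # Location where you put the downloaded ChromeDriver

options = Options()
options.add_argument('--headless')  # Enable headless mode
# Selenium 4 takes the driver path through a Service object;
# Selenium 3 accepted it as the first positional argument instead.
driver = webdriver.Chrome(service=Service(driver_path), options=options)
driver.get(url)
time.sleep(2)  # Give the page's JavaScript time to render
html = driver.page_source.encode('utf-8')
driver.quit()  # Close the browser once the HTML has been captured
soup = BeautifulSoup(html, 'lxml')  # The 'lxml' parser requires the lxml package
# From here on, use soup with the usual Beautiful Soup API.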

Adding the --headless option prevents a browser window from opening every time driver.get is executed. (This also speeds up the process a bit.)
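The fixed time.sleep(2) in the code above may be too short for slow pages and wasteful for fast ones. One possible alternative, sketched here under the assumption that Selenium 4 is installed, is an explicit wait that returns as soon as a given element appears. The function name fetch_rendered_html and its parameters are hypothetical and not part of the original article.

```python
def fetch_rendered_html(url, css_selector, driver_path, timeout=10):
    """Open url headlessly and return the HTML once css_selector is present.

    Assumes Selenium 4 and a ChromeDriver binary located at driver_path.
    """
    # Imported lazily so this file can be loaded even without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options
    from selenium.webdriver.chrome.service import Service
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    options = Options()
    options.add_argument("--headless")  # Same headless setting as above
    driver = webdriver.Chrome(service=Service(driver_path), options=options)
    try:
        driver.get(url)
        # Wait up to `timeout` seconds for the element, instead of sleeping blindly.
        WebDriverWait(driver, timeout).until(
            EC.presence_of_element_located((By.CSS_SELECTOR, css_selector))
        )
        return driver.page_source
    finally:
        driver.quit()  # Always close the browser, even if the wait times out
```

The returned string can be passed to BeautifulSoup in the same way as html in the code above. If the element does not appear within the timeout, the wait raises an exception, so the caller may want to handle that case.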

Reference

Three settings to make for stable operation of Selenium (also supports Headless mode)
