[PYTHON] Understand Selenium in 10 minutes

What is Selenium?

Selenium is a framework for automating web browser operations. It was developed at ThoughtWorks in 2004 to automate UI testing of web applications. https://selenium.dev/history/

Although it was originally developed for UI testing and JavaScript testing of web applications, it is now also used for purposes other than testing, such as task automation and website crawling.

This article describes how to set up an environment and how to operate Chrome from Python via Selenium.

TL;DR

- Environment setup is very easy with the official Docker image
- You write code that automates your browser operations
- It can also be used as a crawler

Environment

To operate the browser automatically using Selenium, you need to install the following.

- Web browser: Chrome, Firefox, IE, Opera, etc.
- WebDriver that matches the browser
- Selenium bindings for your language (Python in this article)

This article introduces two ways to set up an environment for using Selenium with Python: one uses Docker, and the other builds the environment directly on a local PC.

(Method 1) Build an environment with Docker

It's very easy to set up using the Docker image officially published by Selenium. https://github.com/SeleniumHQ/docker-selenium

This method uses the following configuration.

The browser and Remote WebDriver run inside the Docker container, and Selenium connects to the Remote WebDriver over the network from another host.

Personally, I find this method the easiest to set up, and it is the one I recommend most.

Install Chrome and WebDriver

Simply execute the following command to start the Chrome environment that can be operated from Selenium.

$ docker run -d -p 4444:4444 -v /dev/shm:/dev/shm selenium/standalone-chrome:3.141.59-xenon

WebDriver has one slightly annoying problem: it does not work unless its version matches your browser version. The official Docker image ships with a matching browser and WebDriver already installed, so you can use it right away.
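The container may need a moment after startup before it accepts connections. The following is a minimal sketch of waiting for it from Python using only the standard library. It assumes the server exposes a status endpoint at /wd/hub/status on the port published above and that the response JSON contains value.ready; the helper names is_ready and wait_for_server are made up for this illustration.

```python
import json
import time
import urllib.error
import urllib.request

# Assumption: port 4444 is published as in the docker run command above.
STATUS_URL = "http://localhost:4444/wd/hub/status"


def is_ready(status_payload):
    """Return True if a decoded status response reports that the server is ready."""
    if not isinstance(status_payload, dict):
        return False
    value = status_payload.get("value")
    return isinstance(value, dict) and value.get("ready") is True


def wait_for_server(url=STATUS_URL, timeout=30.0, interval=1.0):
    """Poll the status endpoint until it reports ready or the timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=interval) as response:
                if is_ready(json.load(response)):
                    return True
        except (urllib.error.URLError, OSError, ValueError):
            pass  # server not reachable yet, or the body was not valid JSON
        time.sleep(interval)
    return False
```

Calling wait_for_server() before creating the driver and stopping if it returns False keeps a slow container start from showing up as a confusing connection error.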

Install Python Selenium bindings

On the machine that will run your Python code, install the Selenium library. Python's Selenium bindings can be installed with pip.

$ pip install selenium

Try it out

You can run Selenium from Python with code like this:

from selenium import webdriver

# Set Chrome options
options = webdriver.ChromeOptions()
options.add_argument('--headless')

# Connect to Selenium Server; the options object carries the browser capabilities
# (recent Selenium 4 releases no longer accept a separate desired_capabilities argument)
driver = webdriver.Remote(
    command_executor='http://localhost:4444/wd/hub',
    options=options,
)

# Operate the browser via Selenium
driver.get('https://qiita.com')
print(driver.current_url)

# Quit the browser
driver.quit()

(Method 2) Build an environment locally

Next, I will describe how to build an environment for running Selenium locally on a Mac.

This method uses the following configuration.

The browser and WebDriver both run locally, and Selenium connects to the local WebDriver.

Install Chrome

Many people already have Chrome installed; if you do not, install it in the usual way.

Then check the version of the installed Chrome to decide which version of WebDriver to install. In my environment, 78.0.3904.108 was installed.

Install WebDriver

Download the Chrome WebDriver binary. For Python, there is a convenient library called chromedriver-binary that downloads the WebDriver binary and sets the path for you, so I will use it.

https://github.com/danielkaiser/python-chromedriver-binary

Because the WebDriver version must correspond to the Chrome version, install it with pip, pinning only the major version as follows.

$ pip install "chromedriver-binary==78.*"
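If you want to derive that pin from the Chrome version string instead of typing it by hand, a small helper is enough. This is only a sketch: the function name chromedriver_requirement is made up for this illustration, and it assumes the version string has the dotted form shown above.

```python
def chromedriver_requirement(chrome_version):
    """Build a pip requirement that pins chromedriver-binary to Chrome's major version."""
    major = chrome_version.strip().split(".")[0]
    if not major.isdigit():
        raise ValueError(f"unexpected Chrome version string: {chrome_version!r}")
    return f"chromedriver-binary=={major}.*"


# The version reported in the environment described above
print(chromedriver_requirement("78.0.3904.108"))
```

The printed string can be passed to pip install as is; quote it in the shell so the asterisk is not expanded.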

Install Python Selenium bindings

Python's Selenium bindings are installed with pip.

$ pip install selenium

Try it out

In the local environment, you can run Selenium with the following code.

import chromedriver_binary  # noqa: F401  (importing it adds chromedriver to PATH)
from selenium import webdriver

# Set WebDriver options
options = webdriver.ChromeOptions()
options.add_argument('--headless')

print('launching local browser...')
driver = webdriver.Chrome(options=options)

driver.get('https://qiita.com')
print(driver.current_url)

# Quit the browser
driver.quit()

Compared with the Docker example, the difference is that the Chrome class is used as the WebDriver instead of Remote.

Running the above code launches Chrome on your PC. If you comment out the options.add_argument('--headless') line, you can watch the browser window open and operate.

Basic usage

Now that the environment is ready, let's see how to actually use Selenium.

What we want to do

Let's run Chrome through Selenium and try the following operations.

  1. Visit Qiita's Chanmoro profile page https://qiita.com/Chanmoro

  2. Go to the second page of the article list displayed in "Recent Articles"

  3. Get and display the title and URL of the first article listed on the second page

Run Selenium in Python

Here, we will use the Selenium Server on Docker introduced in the environment setup at the beginning. In a local environment, only the part that sets up WebDriver differs, and the same code can be used for the subsequent operations.

First, here is the whole code.

from selenium import webdriver
from selenium.webdriver.common.by import By

# x. Set Chrome launch options
options = webdriver.ChromeOptions()
options.add_argument('--headless')

# x. Open a new browser window
print('connecting to remote browser...')
driver = webdriver.Remote(
    command_executor='http://localhost:4444/wd/hub',
    options=options,
)

# 1. Visit Qiita's Chanmoro profile page
driver.get('https://qiita.com/Chanmoro')
print(driver.current_url)
# > https://qiita.com/Chanmoro

# 2. Go to the second page of the article list displayed in "Recent Articles"
driver.find_element(By.XPATH, '//a[@rel="next" and text()="2"]').click()
print(driver.current_url)
# > https://qiita.com/Chanmoro?page=2

# 3. Get the title and URL of the first article listed on the second page
article_links = driver.find_elements(By.XPATH, '//div[@class="ItemLink__title"]/a')
print(article_links[0].text)
# > Python -Dynamically call a function from a string
print(article_links[0].get_attribute('href'))
# > https://qiita.com/Chanmoro/items/9b0105e4c18bb76ed4e9

# x. Quit the browser
driver.quit()

Let's go through it step by step.

x. Set Chrome options

First, set the Chrome startup options before launching Chrome. There is a separate options class for each browser, such as ChromeOptions for Chrome and FirefoxOptions for Firefox.

In Chrome, the headless option launches the browser without displaying a window. I think headless mode is what you will use most of the time, but if you want to watch the screen while debugging, you can simply leave this option out.

options = webdriver.ChromeOptions()
options.add_argument('--headless')
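To switch between the two modes without editing the code, one option is to decide the arguments from an environment variable. The following is a minimal sketch; the variable name SHOW_BROWSER and the helper chrome_arguments are hypothetical names made up for this illustration and are not part of Selenium.

```python
import os


def chrome_arguments(environ=None):
    """Return Chrome command-line arguments: headless unless SHOW_BROWSER=1 is set."""
    if environ is None:
        environ = os.environ
    arguments = []
    if environ.get("SHOW_BROWSER") != "1":  # hypothetical variable name
        arguments.append("--headless")
    return arguments


# Pass each returned argument to options.add_argument() before creating the driver.
print(chrome_arguments({"SHOW_BROWSER": "1"}))
print(chrome_arguments({}))
```

With this in place, running the script with SHOW_BROWSER=1 shows the window for debugging, and every other run stays headless.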

x. Open a new browser window

Next, open a new browser window from Selenium. If the browser is not yet running, it is started at this point.

If you are using Selenium Server as introduced in the environment setup at the beginning, use the Remote class and pass it the options object, whose capabilities determine the browser type. In this example, options is a ChromeOptions object, so Chrome is started.

# NOTE: To drive a remote browser, connect through the Remote WebDriver as follows:
driver = webdriver.Remote(
    command_executor='http://localhost:4444/wd/hub',
    options=options,
)

1. Visit Qiita's Chanmoro profile page

Access the specified URL by calling the get() method of the WebDriver object.

driver.get('https://qiita.com/Chanmoro')

You can access the URL currently displayed by the window with current_url and the HTML displayed with page_source.

print(driver.current_url)
print(driver.page_source)

2. Go to the second page of the article list displayed in "Recent Articles"

At the bottom of "Recent Articles" on the profile screen there is a link to the second page; click it to move to that page.

In Selenium, you get the target element with find_element() as shown below and click it by calling click() on that element. Here, the a tag that has the attribute rel="next" and whose text is 2 is selected and clicked.

driver.find_element(By.XPATH, '//a[@rel="next" and text()="2"]').click()

In addition to XPath, Selenium lets you specify the target element in various other ways, such as by CSS selector, ID, name attribute, or class. Personally, I like XPath best because a single expression can pinpoint almost any element, although it is a little tricky until you get used to writing it.

In this example, the XPath is passed to find_element() together with By.XPATH. Older Selenium releases also offer find_element_by_xpath() for the same purpose, but the find_element_by_* methods were deprecated and then removed in Selenium 4, so the By form is the safer choice.

Check this document for details on the methods for specifying elements. https://selenium-python.readthedocs.io/locating-elements.html
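To get a feel for what such an XPath matches without starting a browser, you can try a similar expression on a small document with the standard library. This is only a sketch: the markup below is a hypothetical, simplified stand-in for the real page, and xml.etree.ElementTree implements just a subset of XPath (no "and", no text()), so the two conditions are written as chained predicates.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified pagination markup; the real page is more complex.
SAMPLE = """
<ul>
  <li><a rel="prev" href="?page=1">1</a></li>
  <li><a rel="next" href="?page=2">2</a></li>
  <li><a href="?page=3">3</a></li>
</ul>
"""

root = ET.fromstring(SAMPLE)

# Roughly corresponds to //a[@rel="next" and text()="2"] in full XPath:
# [@rel='next'] filters on the attribute, [.='2'] filters on the element text.
matches = root.findall(".//a[@rel='next'][.='2']")
next_href = matches[0].get("href")
print(next_href)
```

In the browser, Selenium evaluates the full XPath expression against the live DOM, so the original expression with "and" and text() works there as written.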

3. Get the URL for the title of the article displayed at the very beginning of the second page

This time, we get multiple elements with find_elements(). The find_element() used for the click returns only the first element that matches the condition, whereas find_elements() returns a list of all matching elements.

On the profile page, the class ItemLink__title is given to the element containing each article title in the list, so I relied on it to get the list of article titles.

article_links = driver.find_elements(By.XPATH, '//div[@class="ItemLink__title"]/a')

For a retrieved element, you can get the text inside the tag with text, and attributes such as href with get_attribute().

print(article_links[0].text)
print(article_links[0].get_attribute('href'))
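Indexing with [0] raises an IndexError when nothing matched, because find_elements() then returns an empty list. If you want every title and URL rather than only the first, a small helper like the following may be more convenient. It is a sketch: link_summaries is a name made up for this illustration, and it only assumes that each element has a text attribute and a get_attribute() method, as Selenium's elements do.

```python
def link_summaries(elements):
    """Return (title, url) pairs for elements that expose .text and .get_attribute()."""
    return [
        (element.text.strip(), element.get_attribute("href"))
        for element in elements
    ]


# With a live driver this would be called as link_summaries(article_links).
# An empty result simply means that nothing matched the locator.
print(link_summaries([]))
```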

x. Exit the browser

Finally, when the process is complete, call quit() to exit the browser.

driver.quit()

If an error occurs during Selenium processing and quit() is never called, the browser keeps running and memory consumption grows steadily. Handle errors with try/finally or similar so that quit() is always called at the end of the program.
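One way to make sure quit() always runs is to wrap the driver in a context manager. The following is a minimal sketch; managed_driver is a name made up for this illustration, and create_driver stands for any function that returns a driver, for example one that calls webdriver.Remote(...) as shown earlier.

```python
from contextlib import contextmanager


@contextmanager
def managed_driver(create_driver):
    """Yield the driver returned by create_driver() and always quit it afterwards."""
    driver = create_driver()
    try:
        yield driver
    finally:
        # Runs on normal exit and when the with-block raises.
        driver.quit()
```

Inside a block such as "with managed_driver(make_driver) as driver:", you call driver.get() and the other methods as usual. If an exception is raised in the block, quit() is still called before the exception propagates.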

(Tips) Make a crawler with Selenium

Selenium can also be used as part of a crawler. Use Selenium for JavaScript rendering and operations such as button clicks, then run an HTML parser over the rendered page to extract the elements.

You can extract data with Selenium's own element-lookup methods, but I recommend a library such as BeautifulSoup, because it has useful functions for parsing and allows a more flexible implementation.

Specifically, as shown in the code below, the HTML obtained from driver.page_source is parsed to extract the data.

from bs4 import BeautifulSoup
from selenium import webdriver


options = webdriver.ChromeOptions()
options.add_argument('--headless')

driver = webdriver.Remote(
    command_executor='http://localhost:4444/wd/hub',
    options=options,
)

driver.get('https://qiita.com/Chanmoro')

# Create a BeautifulSoup object from the HTML displayed in the browser and parse it
soup = BeautifulSoup(driver.page_source, 'html.parser')

articles = soup.select('.ItemLink')
for article in articles:
    # Print each article title displayed on the profile page
    print(article.select_one('.ItemLink__title a').get_text())

driver.quit()
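A crawler usually has to walk through several pages rather than one. The loop itself can be kept separate from Selenium, as in the sketch below. crawl_titles and load_titles are names made up for this illustration: load_titles stands for a function you write that opens page N (for example via the ?page=N URL pattern seen earlier) and returns the titles parsed from driver.page_source.

```python
def crawl_titles(load_titles, max_pages=10):
    """Collect titles page by page until a page comes back empty or max_pages is reached."""
    collected = []
    for page_number in range(1, max_pages + 1):
        titles = load_titles(page_number)
        if not titles:
            break  # an empty page is treated as the end of the list
        collected.extend(titles)
    return collected


# Stand-in data so the loop can be tried without a browser.
fake_pages = {1: ["first article", "second article"], 2: ["third article"]}
print(crawl_titles(lambda page_number: fake_pages.get(page_number, [])))
```

Keeping the page loop independent of the driver also makes it easy to test, and max_pages acts as a safety limit so a mistake in the stop condition does not keep hitting the site.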

For more information on how to use BeautifulSoup, please read the wonderful article "Beautiful Soup in 10 Minutes"! (yes, a shameless plug) https://qiita.com/Chanmoro/items/db51658b073acddea4ac

Summary

In this article, I introduced how to set up an environment for Selenium and its basic usage. Building the environment with Docker is very easy, and building it on a local PC is not too difficult either, so you can try either method right away.

Selenium can be used to automate UI testing and browser operations and, as I mentioned at the end, to crawl sites such as SPAs that are rendered with JavaScript.

For UI testing, there is also a very useful mechanism called Selenium Grid. It lets you pool multiple types of browsers, such as Chrome, Firefox, and IE, as well as multiple browser versions, and run the same test in parallel on them via a Hub.

A Selenium Grid environment is also easy to create with Docker; see the Selenium Grid documentation and the docker-selenium README for details. https://selenium.dev/documentation/en/grid/ https://github.com/SeleniumHQ/docker-selenium
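On the client side, running the same check against several browsers at once could be sketched as follows. This only illustrates the parallel part: run_on_browsers and check are names made up for this illustration, and check stands for a function you write that connects to the Hub with webdriver.Remote using the options class for the given browser, runs the test, and returns its result.

```python
from concurrent.futures import ThreadPoolExecutor


def run_on_browsers(check, browser_names):
    """Run check(browser_name) for every browser in parallel and return results by name."""
    worker_count = max(1, len(browser_names))
    with ThreadPoolExecutor(max_workers=worker_count) as pool:
        results = list(pool.map(check, browser_names))
    return dict(zip(browser_names, results))


# Stand-in check so the sketch runs without a Grid.
print(run_on_browsers(lambda name: f"checked {name}", ["chrome", "firefox"]))
```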

While writing this article, I read the History of Selenium and learned that ThoughtWorks engineers created the original concept and core functionality of Selenium. Personally, just knowing it was made by people at ThoughtWorks makes it feel even cooler.

There is also a very useful browser extension called Selenium IDE, which records the operations you perform manually in the browser and plays them back. It seems it was developed by a Japanese engineer, Shinya Kasatani (@shinya).

Selenium must have had a revolutionary impact on web application development, and it is really amazing that such software was created, released as OSS, and is now widely used all over the world.

Now, let's all learn how to use Selenium, stand on the shoulders of giants, and enjoy a fun browser automation life starting today!
