Is scraping with Python and Selenium outdated? How to use Pyppeteer

How to use Pyppeteer

Table of contents

- What is Pyppeteer
- Install
- How to use
- Launch your browser and open your site
- Get element
  - Type in the text box
  - Click
- Get the value of an attribute
  - Get innerHTML

What is Pyppeteer

Pyppeteer is a Python package for controlling the Chrome browser. It is a Python port of Puppeteer, the Node.js library. Because there is little information about Pyppeteer itself, searching for Puppeteer instead often gives better results.

See also: the Pyppeteer author's blog and the Puppeteer site.


Install

pip install pyppeteer

How to use

Launch your browser and open your site

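import asyncio

from pyppeteer import launch


async def main():
    # Start Chromium (headless by default) and open a new tab
    browser = await launch()
    page = await browser.newPage()
    # Assumption: the URL is missing from the original; the text says the Google site
    await page.goto('https://www.google.com/')
    # Close the browser when finished
    await browser.close()


if __name__ == "__main__":
    asyncio.run(main())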
Running the script above starts Chromium in headless mode, opens the Google site, and closes the browser. Chromium is downloaded only once, the first time Pyppeteer runs. By default, Pyppeteer launches the browser in headless mode. To display the browser window, pass headless=False to launch:

browser = await launch(headless=False)
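A script that launches a browser should also close it, even when navigation fails, or a Chromium process can be left running. The following is a minimal sketch of one way to do that with try/finally. The helper name fetch_title and its parameters are hypothetical and not from the original article, and the import sits inside the function only so that the helper can be defined on a machine where Pyppeteer is not installed.

```python
import asyncio


async def fetch_title(url: str, headless: bool = True) -> str:
    """Open `url`, return the page title, and always close the browser."""
    # Imported lazily (an illustrative choice) so defining this helper
    # does not require Pyppeteer to be installed.
    from pyppeteer import launch

    browser = await launch(headless=headless)
    try:
        page = await browser.newPage()
        await page.goto(url)
        return await page.title()
    finally:
        # Runs even if goto() raises, so the browser is not left open.
        await browser.close()
```

It could be called as asyncio.run(fetch_title(url, headless=False)) with whatever URL you want to open; passing headless=False shows the browser window while it runs.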

Get element

# Get the first element that matches the selector
# page.J is a shorthand for page.querySelector
textbox = await page.querySelector('input[aria-label="Search"]')
textbox = await page.J('input[aria-label="Search"]')

# Get all elements that match the selector
# page.JJ is a shorthand for page.querySelectorAll
buttons = await page.querySelectorAll('input[aria-label="Google search"]')
buttons = await page.JJ('input[aria-label="Google search"]')
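querySelector returns None when nothing matches, and querySelectorAll returns an empty list, so a mistyped selector tends to show up later as an error on None rather than at the lookup itself. The sketch below wraps the lookup in a hypothetical helper, require_element (not part of Pyppeteer), that fails early and names the selector. It only assumes that the page object has a querySelector coroutine.

```python
async def require_element(page, selector: str):
    """Return the first element matching `selector`, or raise LookupError."""
    element = await page.querySelector(selector)
    if element is None:
        # Fail here, with the selector in the message, instead of later
        # when a method is called on None.
        raise LookupError(f"no element matches selector: {selector!r}")
    return element
```

For example, textbox = await require_element(page, 'input[aria-label="Search"]') would behave like the lookup above when the element exists.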

Type in the text box

# page.type(selector, text) types text into the element that matches the selector
# The same can be done from an element you have already acquired
await page.type('input[aria-label="Search"]', 'pyppeteer')
await textbox.type('pyppeteer')


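Click

# page.click(selector) clicks the element that matches the selector
# The same can be done from an element you have already acquired

# Google's search button is the second matching element, so it may be
# better to take it from the querySelectorAll result
await page.click('input[aria-label="Google search"]:last-child')
await buttons[1].click()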
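Clicking a search button usually triggers a page load, and reading the page right after click() can race with that navigation. One common approach is to start waiting for the navigation together with the click. The sketch below uses page.waitForNavigation with asyncio.gather; the helper name click_and_wait is hypothetical and not from the original article.

```python
import asyncio


async def click_and_wait(page, element) -> None:
    """Click `element` and wait until the resulting navigation has finished."""
    # gather() schedules both coroutines together, with the navigation
    # wait listed first, and returns once both have completed.
    await asyncio.gather(
        page.waitForNavigation(),
        element.click(),
    )
```

With the elements acquired earlier, this could be used as await click_and_wait(page, buttons[1]).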

Get the value of an attribute

Get it by running JavaScript with page.evaluate, passing the acquired element as an argument.

text = await page.evaluate('elm => elm.getAttribute("name")', textbox)

Get innerHTML

This works the same way as above: run JavaScript with page.evaluate.

elm = await page.J('#hptl')
text = await page.evaluate('elm => elm.innerHTML', elm)
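The attribute and innerHTML lookups above follow the same shape: find an element, then pass it to a JavaScript function through page.evaluate. A minimal sketch that folds both steps into one hypothetical helper, read_from_element (the name and the None-on-miss behavior are assumptions for illustration, not part of Pyppeteer):

```python
async def read_from_element(page, selector: str, js_function: str):
    """Evaluate `js_function` against the first element matching `selector`.

    Returns None when no element matches.
    """
    element = await page.querySelector(selector)
    if element is None:
        return None
    # The element handle is passed to the JavaScript function as its argument.
    return await page.evaluate(js_function, element)
```

For instance, await read_from_element(page, '#hptl', 'elm => elm.innerHTML') would correspond to the innerHTML example, and swapping in 'elm => elm.getAttribute("name")' would correspond to the attribute example.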
