[python, ruby] fetch the contents of a web page with selenium-webdriver

Introduction

I started out writing this in both Ruby and Python, but partway through I settled on Python, so the Ruby part ended up quite short. Please treat the Ruby section as a supplementary note.

Python

Installation

selenium

pip install selenium

chromedriver (I was on a Mac, so I used Homebrew)

brew install chromedriver

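On Linux, something like the following may work (I haven't tried it, so I can't say for sure). Note that this installs the Chromium browser itself; chromedriver may need to be installed separately.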

sudo apt-get install chromium-browser

Reference page: http://stackoverflow.com/questions/8255929/running-webdriver-chrome-with-selenium

Simple code

A simple example that opens the Google homepage, waits 10 seconds, and then closes the browser

sample.py


from selenium import webdriver
from time import sleep
browser = webdriver.Chrome()
browser.get('http://google.com')
sleep(10)
# quit() ends the whole session; close() only closes the current window
browser.quit()
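
If the script raises an error before it reaches its last line, the browser window is left open. Below is a minimal sketch of one way to guard against that. It assumes a small helper called managed_browser (a name made up for this note, not part of Selenium) that takes any callable returning a driver, such as webdriver.Chrome, and always calls quit() when the block ends.

```python
from contextlib import contextmanager


@contextmanager
def managed_browser(factory):
    """Create a driver with factory() and always quit it afterwards.

    managed_browser is a hypothetical helper, not a Selenium API.
    """
    browser = factory()
    try:
        yield browser
    finally:
        # Runs even if the body of the with-block raised an exception.
        browser.quit()


def main():
    # Imported here so the helper above can be used without Selenium installed.
    from selenium import webdriver

    with managed_browser(webdriver.Chrome) as browser:
        browser.get('http://google.com')
        print(browser.title)
```

Calling main() opens Chrome, prints the page title, and shuts the browser down whether or not an error occurred along the way.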

Logging in

login.py


from selenium.webdriver.common.by import By

# Find the element whose id is "email"
mail = browser.find_element(By.ID, 'email')
# Find the element whose id is "pass"
pass_wd = browser.find_element(By.ID, 'pass')
# Type the email address
mail.send_keys('[email protected]')
# Type the password
pass_wd.send_keys('password')
# Submit the form that contains the password field
pass_wd.submit()
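
The login snippet has the email address and password written directly in the source. As a rough sketch of one alternative, the values could be read from environment variables instead. The function name load_credentials and the variable names LOGIN_EMAIL and LOGIN_PASSWORD are assumptions made up for this example.

```python
import os


def load_credentials(env=None):
    """Return (email, password) read from environment variables.

    LOGIN_EMAIL and LOGIN_PASSWORD are hypothetical names chosen for
    this sketch; use whatever names suit your setup.
    """
    env = os.environ if env is None else env
    try:
        return env['LOGIN_EMAIL'], env['LOGIN_PASSWORD']
    except KeyError as missing:
        raise RuntimeError(f'environment variable {missing} is not set') from None
```

The two returned values can then be passed to send_keys in place of the literal strings.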

Allowing / blocking Chrome notifications

With Ruby I could leave the notification permission popup alone, but with Python the popup prevented the program from continuing, so set the Chrome options in advance.

Before


browser = webdriver.Chrome()

After


chrome_options = webdriver.ChromeOptions()
# 1 = allow notifications, 2 = block notifications
prefs = {"profile.default_content_setting_values.notifications": 2}
chrome_options.add_experimental_option("prefs", prefs)
browser = webdriver.Chrome(options=chrome_options)

Scrolling

Scroll to the bottom of the page

browser.execute_script("window.scrollTo(0, document.body.scrollHeight);")
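
On pages that load more content as you scroll, a single scroll is often not enough. The following is a minimal sketch of repeating the same scrollTo call until the page height stops growing. The function name scroll_to_end and the pause and max_rounds parameters are made up for this example, and a fixed pause is only a rough guess at how long the page needs to load.

```python
from time import sleep


def scroll_to_end(browser, pause=1.0, max_rounds=20):
    """Scroll down repeatedly until the page height stops growing.

    Returns the number of scrolls performed. scroll_to_end, pause and
    max_rounds are hypothetical names for this sketch.
    """
    last_height = browser.execute_script("return document.body.scrollHeight")
    for rounds in range(1, max_rounds + 1):
        browser.execute_script("window.scrollTo(0, document.body.scrollHeight);")
        sleep(pause)  # give the page time to load more content
        new_height = browser.execute_script("return document.body.scrollHeight")
        if new_height == last_height:
            # Nothing new was loaded, so we are at the real bottom.
            return rounds
        last_height = new_height
    return max_rounds
```

max_rounds is there so that a page which never stops loading cannot keep the script running forever.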

Getting links

All links inside an element you have already located (myelement)

from selenium.webdriver.common.by import By

links = myelement.find_elements(By.XPATH, ".//a")

All links in the whole page

links = browser.find_elements(By.XPATH, "//a")

Once you have the link elements from either of the above, use get_attribute('href') to get each URL

urls = [ link.get_attribute('href') for link in links]
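
The resulting list can contain None (for a tags without an href), javascript: pseudo-links, and duplicates. A small sketch of tidying it up is shown below. clean_urls is a name made up for this example, and keeping only http and https links is an assumption about what you want to crawl.

```python
def clean_urls(hrefs):
    """Drop missing and non-HTTP hrefs and remove duplicates, keeping order."""
    seen = set()
    result = []
    for href in hrefs:
        # get_attribute('href') returns None when an <a> has no href.
        if not href or not href.startswith(('http://', 'https://')):
            continue
        if href not in seen:
            seen.add(href)
            result.append(href)
    return result
```

With the list from above this would be called as clean_urls(urls).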

Reference pages

http://www.takunoko.com/blog/pythonselenium%E3%81%A7twitter%E3%81%AB%E3%83%AD%E3%82%B0%E3%82%A4%E3%83%B3%E3%81%97%E3%81%A6%E3%81%BF%E3%82%8B/ (simple login)

http://selenium-python.readthedocs.io/faq.html (scrolling, getting links)

Ruby

Installation

selenium-webdriver gem

gem install selenium-webdriver

chromedriver

Download and unzip chromedriver, check where ruby is installed with "which ruby", and move chromedriver into that directory.

If you are using rbenv, the following command works:

mv chromedriver ~/.rbenv/shims

Simple code

require "selenium-webdriver"

driver = Selenium::WebDriver.for :chrome
driver.navigate.to "http://google.com"

driver.quit

Logging in

# type email
element = driver.find_element(:id, 'email')
element.send_keys '[email protected]'
# type password
element = driver.find_element(:id, 'pass')
element.send_keys 'password'
# submit the form
element.submit

Now you can get the screen after logging in.

Reference pages

http://shoprev.hatenablog.com/entry/2014/04/14/210529 (see the ChromeDriver setup and simple code sections)

https://gist.github.com/huangzhichong/3284966 (see here for more details)
