[PYTHON] You will be an engineer in 100 days ――Day 74 ――Programming ――About scraping 5

Click here for the articles up to yesterday

You will become an engineer in 100 days --Day 70 --Programming --About scraping

You will become an engineer in 100 days --Day 66 --Programming --About natural language processing

You will become an engineer in 100 days --Day 63 --Programming --Probability 1

You will become an engineer in 100 days --Day 59 --Programming --Algorithm

You will become an engineer in 100 days --Day 53 --Git --About Git

You will become an engineer in 100 days --Day 42 --Cloud --About cloud services

You will become an engineer in 100 days --Day 36 --Database --About the database

You will be an engineer in 100 days --Day 24 --Python --Basics of Python language 1

You will become an engineer in 100 days --Day 18 --Javascript --JavaScript basics 1

You will become an engineer in 100 days --Day 14 --CSS --CSS Basics 1

You will become an engineer in 100 days --Day 6 --HTML --HTML basics 1

This article is another continuation of the scraping series.

We have covered most of the principles of scraping up to the last article. Today's topic is Selenium.

About Selenium

Selenium is framework software for automating the operation of web browsers.

By using Selenium, you can obtain information that cannot be obtained by scraping with the Python requests library alone.

So what is the information that cannot be obtained?

With the requests library, what you obtain with the get method and similar calls is the HTML source as the server sent it.

If some elements of the page are written to be rendered by JavaScript, they do not appear in the data unless the JavaScript actually runs.

Therefore, elements that are generated dynamically by JavaScript cannot be obtained with the requests library.

Since Selenium runs a real web browser to get the data, it is no different from visiting the page with a normal browser. JavaScript also runs, so you can get the rendered data.
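To make the difference concrete, here is a minimal sketch using only the standard library. The page source in `RAW_HTML` and the element id `price` are made up for illustration; they stand in for what a plain HTTP client such as requests would receive. The point is that the `div` is empty in the source, because the script that fills it only runs inside a browser.

```python
from html.parser import HTMLParser

# Hypothetical page source, as a plain HTTP client would receive it.
# The <div id="price"> is empty; the script fills it in only inside a browser.
RAW_HTML = """
<html>
  <body>
    <div id="price"></div>
    <script>
      document.getElementById("price").textContent = "1,200 yen";
    </script>
  </body>
</html>
"""


class DivTextCollector(HTMLParser):
    """Collect the text inside the <div> with the given id (no nested divs)."""

    def __init__(self, target_id):
        super().__init__()
        self.target_id = target_id
        self.inside = False
        self.text = ""

    def handle_starttag(self, tag, attrs):
        if tag == "div" and dict(attrs).get("id") == self.target_id:
            self.inside = True

    def handle_endtag(self, tag):
        if tag == "div":
            self.inside = False

    def handle_data(self, data):
        if self.inside:
            self.text += data


def text_of_div(html, div_id):
    """Return the stripped text of the div with the given id in the raw HTML."""
    parser = DivTextCollector(div_id)
    parser.feed(html)
    return parser.text.strip()


# Nothing executes the <script>, so the div is still empty in the raw source.
static_text = text_of_div(RAW_HTML, "price")
print(repr(static_text))
```

The value only exists as JavaScript text inside the source. A browser driven by Selenium would run that script, so the same element would contain the rendered text.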

What you need to run Selenium

The following three are required to run Selenium on a PC.

Web browser: Chrome, Firefox, Opera, etc.

WebDriver: Software for operating the browser

Selenium: A library that operates the browser programmatically in cooperation with WebDriver

Installation of various tools

The installation method is as follows.

Installing a web browser: Download the installer from each browser's download site and install it.

Google Chrome

Firefox

Opera

Downloading WebDriver: WebDriver does not need to be installed. Just download it and put it in place. After downloading, place it in a directory close to your program.

The driver has to match the browser version, so download the matching driver each time the browser is upgraded.

Google Chrome

Firefox

Opera
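As a rough way to check whether a driver matches the browser, the sketch below compares major version numbers. It assumes, as is the case for Chrome and ChromeDriver, that the major versions should agree; other browsers may follow different compatibility rules. The function names and the version strings are made up for illustration.

```python
import re


def major_version(version_text):
    """Extract the major version number from text such as 'Google Chrome 114.0.5735.90'."""
    match = re.search(r"(\d+)\.\d+", version_text)
    if match is None:
        raise ValueError(f"no version number found in {version_text!r}")
    return int(match.group(1))


def versions_match(browser_text, driver_text):
    """Return True when the browser and driver share the same major version."""
    return major_version(browser_text) == major_version(driver_text)


# Example version strings, made up for illustration.
print(versions_match("Google Chrome 114.0.5735.90", "ChromeDriver 114.0.5735.90"))
print(versions_match("Google Chrome 115.0.5790.102", "ChromeDriver 114.0.5735.90"))
```

The first call reports a match and the second does not, which is the situation you run into after the browser updates itself and the old driver is still in place.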

Installing Selenium: In Python, install it as follows.

pip install selenium
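After installing, you may want to confirm that Python can actually see the package. The following is a small sketch using the standard `importlib.metadata` module (Python 3.8 or later); the helper name `installed_version` is my own and not part of any library.

```python
from importlib import metadata


def installed_version(package_name):
    """Return the installed version string of a package, or None if it is not installed."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


version = installed_version("selenium")
if version is None:
    print("selenium is not installed; run: pip install selenium")
else:
    print("selenium version:", version)
```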

Run Selenium

To recap, the steps to get Selenium running are:

  1. Install a browser
  2. Download and place the WebDriver
  3. Install Selenium

Here, let's operate Google Chrome from Selenium.

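from selenium import webdriver

# Driver settings: the full (absolute) path to the downloaded ChromeDriver
chromedriver = "full path to the driver"
# Note: the executable_path argument is the Selenium 3 style;
# Selenium 4.10 and later no longer accept it here.
driver = webdriver.Chrome(executable_path=chromedriver)

# Open the target page in the launched browser
driver.get('URL of the page to access')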

Doing this will launch your browser.

Since the browser to launch is Google Chrome, I am using webdriver.Chrome. The class to use depends on the browser.

Firefox: webdriver.Firefox
Opera: webdriver.Opera

Specify the path to the WebDriver in `executable_path`. It does not seem to be recognized unless it is a full path (absolute path), so it is a good idea to put the WebDriver in a shallow directory.
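Since the path has to be absolute, one option is to let Python build it for you. The sketch below is only an illustration: the helper names, the `drivers/chromedriver` location, and the example URL are all assumptions of mine. It also uses the Selenium 4 style, in which the driver path is passed through a `Service` object instead of directly to `webdriver.Chrome`.

```python
from pathlib import Path


def absolute_driver_path(path_text):
    """Turn a possibly relative driver path into an absolute one."""
    return Path(path_text).expanduser().resolve()


def launch_chrome(driver_path, url):
    """Start Chrome through Selenium (Selenium 4 style) and open the given URL."""
    # Imported here so the path helper above works even without selenium installed.
    from selenium import webdriver
    from selenium.webdriver.chrome.service import Service

    service = Service(executable_path=str(driver_path))
    driver = webdriver.Chrome(service=service)
    driver.get(url)
    return driver


if __name__ == "__main__":
    # "drivers/chromedriver" and the URL are placeholders for illustration.
    driver_path = absolute_driver_path("drivers/chromedriver")
    if driver_path.exists():
        driver = launch_chrome(driver_path, "https://example.com")
        print(driver.title)
        driver.quit()
    else:
        print("No driver found at", driver_path)
```

The relative path is resolved against the current working directory, so run the script from the directory that contains the driver folder, or pass a path that is already absolute.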

Have you been able to launch your browser using Selenium so far?

Next time, I will cover how to operate the browser from this point.

Summary

Selenium is convenient because it lets you easily obtain information that cannot be obtained with ordinary scraping techniques.

If you are having trouble getting data, try Selenium.

26 days until you become an engineer

Author information

Otsu py's HP: http://www.otupy.net/

Youtube: https://www.youtube.com/channel/UCaT7xpeq8n1G_HcJKKSOXMw

Twitter: https://twitter.com/otupython
