Get a capture of the entire web page in Selenium Python VBA

Get a capture of the entire web page with Selenium Python VBA

point

① Use chrome ② Make it headless ③ Get the vertical and horizontal lengths for each page with javascript, set them, and get a capture ④ If you think that the site will time out suddenly, you can restart Chrome.

Requests for capture acquisition come suddenly, so you need to be careful not to miss a capture quickly. For that purpose, make sure that the capture is done in the specified folder at the end. If it times out, you can restart Chrome and continue to capture. It's the easiest way to write a program quickly. But this is pretty good.

■ Environment ・ Windodws10 ・ Python 3.8.3

Get a capture of the entire web page with python selenium

Get a capture of the entire web page


from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC


options = Options()
options.add_argument('--hide-scrollbars') #Turn off the scroll bar
options.add_argument('--incognito')        #Secret mode
options.add_argument('--headless')         #Headless (browser disappears, it's a good idea to add this option after testing)
driver = webdriver.Chrome(options=options) #Description when the path is passing

try:
  WebDriverWait(self.driver, 15).until(EC.presence_of_all_elements_located)
  driver.get("https://testtesttest.com")  
 #↑ Move to the URL to be captured. Time-out errors may suddenly occur frequently on sites with many images and advertisements.
except Exception:
   #Write a description of driver restart here
   

#Get the vertical and horizontal size and get the capture
page_width = driver.execute_script('return document.body.scrollWidth')
page_height = driver.execute_script('return document.body.scrollHeight')        
driver.set_window_size(page_width, page_height)

#Get current time for file name
now = datetime.datetime.now()
zikan = now.strftime('%Y%m%d_%H%M%S')
filename = "file name"+ "_"+ zikan + ".jpg "   #The extension can be ping

#Take a capture
driver.save_screenshot("./Folder name/" + filename)

#Search for a capture file for up to 5 seconds
start=time.time()
while time.time()-start<=5:
  if os.path.exists(./Folder name/+filename):
      break      #Exit when the file is found
      time.sleep(1)
else:
  #Write what to do if the capture is not found

driver.quit()

Get a capture of the entire web page with vba selenium Basic

When operating with vba Since many Excel files are often running, the capture save destination is It is described with a full path instead of a relative path. In my experience, sites with a lot of images and promotions suddenly have frequent timeout errors. Even in that case, if I put in the restart process with my own function OnceMoreGet, I proceeded without stopping.

Get a capture of the entire web page


Dim deiver as New ChromeDriver
Dim tate As Long
Dim yoko As Long
Dim Target as String
driver.AddArgument "headless" 'Headless
driver.AddArgument "disable-gpu" 'Temporarily needed options. It may be unnecessary, but just in case
driver.AddArgument "incognito"   'Secret mode
driver.AddArgument "hide-scrollbars"  'Turn off the scroll bar

'Set the wait time for reading and timeout 30 seconds
driver.Timeouts.PageLoad = 30000        
driver.Timeouts.Server = 30000
driver.Timeouts.ImplicitWait = 30000
driver.Timeouts.Script = 30000
driver.Start

'Simple for light sites that don't time out
'driver.get("URL of the site")OK
'↓ is a function that restarts when a timeout occurs after a screen transition
If Not OnceMoreGet(driver, "URL of the site") Then     
     'If the restart fails after the timeout, write some processing here
End If

'Capture acquisition process
tate = driver.ExecuteScript("return document.body.scrollHeight")
yoko = driver.ExecuteScript("return document.body.scrollWidth")
driver.Window.SetSize yoko, tate
Target = "Enter the file name here with the full path"

'Screenshot: There may be an error here, so it is better to have an error avoidance process.
driver.TakeScreenshot.SaveAs (Target)      

'Use the Dir function to check if the capture is possible.
Dim timeout As Date
timeout = DateAdd("s", 5, Now)
Dim str As String
Do
  str = Dir(Target)
  If Now > timeout Then
     'Write the processing when only the capture file is not found
  End If   
    
Loop Until str <> ""

driver.quit

'A function that restarts when a timeout error occurs
Function OnceMoreGet(driver As ChromeDriver, url As String) As Boolean
  On Error GoTo ErrorHandler
    diver.Get (url)        
    OnceMoreGet = True
    Exit Function        
ErrorHandler:
    'Reboot if an exception occurs
    driver.Quit
    Call WaitFor(3)
    driver.Start
    driver.Get (url)
    OnceMoreGet = True
End Function

An amateur is writing by himself. I would appreciate any advice or comments.

Recommended Posts

Get a capture of the entire web page in Selenium Python VBA
Get the caller of a function in Python
Get the value selected in Selenium Python VBA pull-down
Get the number of specific elements in a python list
How to determine the existence of a selenium element in Python
[python, ruby] fetch the contents of a web page with selenium-webdriver
Get the number of readers of a treatise on Mendeley in Python
[Python] Get the files in a folder with Python
Get the number of searches with a regular expression. SeleniumBasic VBA Python
Make a copy of the list in Python
Output in the form of a python array
Get a glimpse of machine learning in Python
Get a datetime instance at any time of the day in Python
How to get a list of files in the same directory with python
How to get the number of digits in Python
[python] Get the list of classes defined in the module
Get the size (number of elements) of UnionFind in Python
Get the URL of the HTTP redirect destination in Python
A reminder about the implementation of recommendations in Python
[Python3] Take a screenshot of a web page on the server and crop it further
Get the value of a specific key in a list from the dictionary type in the list with Python
[For beginners] Web scraping with Python "Access the URL in the page to get the contents"
Find out the apparent width of a string in python
[Django3] Display a web page in Django3 + WSL + Python virtual environment
Get a list of purchased DMM eBooks with Python + Selenium
Get a Python web page, character encode it, and display it
[Note] Import of a file in the parent directory in Python
How to get the last (last) value in a list in Python
How to get a list of built-in exceptions in python
Find the eigenvalues of a real symmetric matrix in Python
Get the index of each element of the confusion matrix in Python
Get the source of the page to load infinitely with python.
Get the desktop path in Python
Get the value of a specific key up to the specified index in the dictionary list in Python
Get web screen capture with python
Get the script path in Python
[OCI] Python script to get the IP address of a compute instance in Cloud Shell
Hit the web API in Python
I tried to create a Python script to get the value of a cell in Microsoft Excel
Get the desktop path in Python
Get the host name in Python
Try to get a list of breaking news threads in Python.
How to check the memory size of a variable in Python
Read the standard output of a subprocess line by line in Python
How to check the memory size of a dictionary in Python
[Python] Get the update date of a news article from HTML
A function that measures the processing time of a method in python
[python] Get the rank of the values in List in ascending / descending order
How to get the vertex coordinates of a feature in ArcPy
I wrote a script to extract a web page link in Python
Get and set the value of the dropdown menu using Python and Selenium
Create a function to get the contents of the database in Go
Get the formula in an excel file as a string in Python
Get a list of files in a folder with python without a path
Get the title and delivery date of Yahoo! News in Python
Check the behavior of destructor in Python
Write the test in a python docstring
Display a list of alphabets in Python 3
[Python] [Windows] Take a screen capture in Python
Run the Python interpreter in a script
The result of installing python in Anaconda