[PYTHON] Build a screenshot capture machine with Selenium

I used Selenium to build a screenshot capture server on CentOS 6.

First, set up a virtual display so that Firefox can run on a machine that has only a command-line (CUI) environment. The virtual display is provided by Xvfb.

# yum -y install Xvfb firefox

The same command also installs Firefox.

Then generate the machine ID (UUID) that D-Bus uses

# dbus-uuidgen > /var/lib/dbus/machine-id

Prepare to drive Selenium from Python

# cd /usr/local/src
# wget http://peak.telecommunity.com/dist/ez_setup.py
# python ez_setup.py
# wget https://raw.github.com/pypa/pip/master/contrib/get-pip.py
# python get-pip.py
# pip install selenium
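
Before going further, it can help to confirm that the preparation steps actually took effect. The following is a minimal sketch, written for Python 3, not part of the original setup; the function name `missing_prerequisites` is hypothetical, and the defaults simply mirror the commands and the machine ID path used above.

```python
import importlib.util
import os
import shutil


def missing_prerequisites(commands=("Xvfb", "firefox"),
                          modules=("selenium",),
                          machine_id="/var/lib/dbus/machine-id"):
    """Return a list of problems; an empty list means every check passed."""
    problems = []
    for command in commands:
        # shutil.which returns None when the command is not on PATH.
        if shutil.which(command) is None:
            problems.append("command not found: " + command)
    for module in modules:
        # find_spec returns None when a top-level module cannot be located.
        if importlib.util.find_spec(module) is None:
            problems.append("Python module not found: " + module)
    # dbus-uuidgen should have written a non-empty machine ID file.
    if not os.path.isfile(machine_id) or os.path.getsize(machine_id) == 0:
        problems.append("machine ID missing or empty: " + machine_id)
    return problems


if __name__ == "__main__":
    for problem in missing_prerequisites():
        print(problem)
```

If the script prints nothing, the commands, the Python module, and the machine ID file were all found.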

Configure Xvfb and enable automatic startup by creating an rc script

# vi /etc/init.d/xvfb
#!/bin/bash
#
# chkconfig: - 91 35
# description: Xvfb
export DISPLAY="localhost:1.0"
# Source function library.
. /etc/init.d/functions

prog=$"Xvfb"

# Xvfb program
XVFB=/usr/bin/Xvfb
STATUS=":1 -screen 0 1024x768x8"
pidf=/var/run/xvfb.pid

start() {
    if [ -e $pidf ];
    then
        action $"Starting $prog: " /bin/false
        echo "$prog already started"
    else
        action $"Starting $prog: " /bin/true
        $XVFB $STATUS > /dev/null 2>&1 &
        echo $! > $pidf
    fi
}

stop() {
    if [ -e $pidf ];
    then
        action $"Stopping $prog: " /bin/true
        pid=`cat $pidf`
        test -n "$pid" && kill "$pid" && rm -f $pidf
    else
        action $"Stopping $prog: " /bin/false
        echo "$prog not running"
    fi
}

status() {
    if [ -e $pidf ];
    then
        echo "$prog (pid `cat $pidf`) is running..."
    else
        echo "$prog not running"
    fi
}

case "$1" in
    start)
        start
        ;;
    stop)
        stop
        ;;
    restart)
        stop
        sleep 1
        start
        ;;
    status)
        status
        ;;
    *)
        echo $"Usage: $0 {start|stop|restart|status}"
        exit 1
esac

exit 0

Register the created rc script as a service

# chmod 755 /etc/init.d/xvfb
# chkconfig --add xvfb
# chkconfig xvfb on
# /etc/init.d/xvfb start
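
One thing to be aware of: the rc script above decides whether Xvfb is running only by checking that the PID file exists, so a crashed Xvfb leaves a stale file behind and `start` then refuses to run. Here is a rough sketch of a stricter check in Python. The function name `xvfb_state` and its three return values are my own invention; the PID file path is the one from the rc script. Note that a recycled PID would still be reported as running.

```python
import errno
import os


def xvfb_state(pidfile="/var/run/xvfb.pid"):
    """Return "running", "stale" (PID file left behind) or "stopped"."""
    try:
        with open(pidfile) as handle:
            pid = int(handle.read().strip())
    except (IOError, OSError):
        return "stopped"      # no readable PID file
    except ValueError:
        return "stale"        # file exists but does not contain a number
    if pid <= 0:
        return "stale"        # never probe PID 0 or negative values
    try:
        os.kill(pid, 0)       # signal 0 only tests whether the process exists
    except OSError as error:
        if error.errno == errno.ESRCH:
            return "stale"    # no such process
        # Any other error (e.g. EPERM) means the process exists.
    return "running"


if __name__ == "__main__":
    print(xvfb_state())
```

If the result is "stale", removing the PID file by hand lets `/etc/init.d/xvfb start` work again.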

Set the virtual display as an environment variable

# vi /etc/profile

Add the following

export DISPLAY="localhost:1.0"
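
`/etc/profile` is read by login shells only, so a capture script launched from cron or from another daemon will not see this variable. As a fallback, the script itself can set `DISPLAY` before starting the browser. This is just a sketch under that assumption; `ensure_display` is a hypothetical helper, and the default value is the one exported above.

```python
import os


def ensure_display(default="localhost:1.0", environ=None):
    """Point DISPLAY at the Xvfb display unless one is already set."""
    if environ is None:
        environ = os.environ
    # Respect a DISPLAY chosen by the caller; fill it in only when missing or empty.
    if not environ.get("DISPLAY"):
        environ["DISPLAY"] = default
    return environ["DISPLAY"]


if __name__ == "__main__":
    print("Using display " + ensure_display())
```

Calling `ensure_display()` before the WebDriver is created is enough, because the browser process inherits the environment of the Python process.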

Finally, create the capture program

$ vi cap.py
# -*- coding: utf-8 -*-
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.common.exceptions import NoSuchElementException
from selenium.common.exceptions import NoAlertPresentException
import unittest, time, re, sys, os
from PIL import Image  # used only by the optional cropping code in test_cap; requires PIL

try:
    URL    = sys.argv[1]
    FILE   = sys.argv[2]
    DEVICE = sys.argv[3]
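except IndexError:
    # Fewer than three arguments were given on the command line.
    print('Usage: python cap.py URL FILE DEVICE(pc or sp)')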
    sys.exit(1)

class Cap(unittest.TestCase):
    def setUp(self):
        profile = webdriver.FirefoxProfile()
        profile.set_preference("browser.download.useDownloadDir", True)
        profile.set_preference("browser.download.folderList", 2)
        profile.set_preference("browser.download.dir", os.path.dirname(FILE))
        profile.set_preference("browser.helperApps.neverAsk.saveToDisk", "text/plain, application/vnd.ms-excel, text/csv, text/comma-separated-values, application/octet-stream")
        profile.set_preference("browser.cache.disk.enable", False)
        profile.set_preference("browser.cache.memory.enable", False)
        profile.set_preference("browser.cache.offline.enable", False)
        profile.set_preference("network.http.use-cache", False)

        useragent = ""
        if DEVICE == "sp":
            useragent = "Mozilla/5.0 (iPhone; CPU iPhone OS 8_0 like Mac OS X) AppleWebKit/600.1.4 (KHTML, like Gecko) Version/8.0 Mobile/12A365 Safari/600.1.4"
        else:
            useragent = "Mozilla/5.0 (compatible; MSIE 10.0; Windows NT 6.2; Trident/6.0)"

        profile.set_preference("general.useragent.override", useragent)

        self.driver = webdriver.Firefox(firefox_profile=profile)
        self.driver.implicitly_wait(100000)
        self.base_url = URL
        self.verificationErrors = []
        self.accept_next_alert = True

        if DEVICE == "sp":
            self.driver.set_window_size(320, 480)

    def test_cap(self):
        driver = self.driver

        driver.get(URL)
        time.sleep(3)
        driver.save_screenshot(FILE)

        # slice image
        # Optional: crop the screenshot to at most 800 px in height.
        # Uncomment to enable (requires PIL).
        # org_im = Image.open(FILE)
        # size   = org_im.size
        # height = 800 if size[1] > 800 else size[1]
        # new_im = org_im.crop((0, 0, size[0], height))
        # new_im.save(FILE, "PNG")
        # os.chmod(FILE, 0o666)

    def is_element_present(self, how, what):
        try:
            self.driver.find_element(by=how, value=what)
        except NoSuchElementException:
            return False

        return True

    def is_alert_present(self):
        try:
            self.driver.switch_to_alert()
        except NoAlertPresentException:
            return False

        return True

    def close_alert_and_get_its_text(self):
        try:
            alert = self.driver.switch_to_alert()
            alert_text = alert.text
            if self.accept_next_alert:
                alert.accept()
            else:
                alert.dismiss()

            return alert_text
        finally:
            self.accept_next_alert = True

    def tearDown(self):
        self.driver.quit()
        self.assertEqual([], self.verificationErrors)

if __name__ == "__main__":
    unittest.main(argv=sys.argv[:1])

How to use

$ python cap.py [URL] [Save destination path] [device(pc or sp)]
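
Because `cap.py` handles one URL per run, capturing many pages means calling it repeatedly. Below is a minimal sketch of a batch driver. The names `build_commands` and `capture_all` and the `capture_001.png` naming scheme are assumptions made for illustration, and `cap.py` is assumed to be in the current directory.

```python
import os
import subprocess
import sys


def build_commands(urls, out_dir, device="pc", script="cap.py"):
    """Return one cap.py argument list per URL, saving to numbered PNG files."""
    commands = []
    for index, url in enumerate(urls, 1):
        target = os.path.join(out_dir, "capture_%03d.png" % index)
        commands.append([sys.executable, script, url, target, device])
    return commands


def capture_all(urls, out_dir, device="pc"):
    """Run the captures one at a time and return the URLs that failed."""
    urls = list(urls)
    failed = []
    for url, command in zip(urls, build_commands(urls, out_dir, device)):
        # unittest.main() exits with a non-zero status when the test fails.
        if subprocess.call(command) != 0:
            failed.append(url)
    return failed
```

Running the captures sequentially keeps a single Firefox instance on the virtual display at a time, which is the simplest arrangement for a small server.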

If you want to post-process the captured screenshot, uncomment and adapt the code under the `# slice image` comment in cap.py.
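
The cropping done by the commented-out code in cap.py boils down to computing a box that keeps the full width and caps the height at 800 px. Here is a small sketch of that step as standalone functions. `crop_box` and `slice_image` are hypothetical names, and `slice_image` assumes PIL (or Pillow) is installed, which the setup steps above do not cover.

```python
def crop_box(size, max_height=800):
    """Return a (left, upper, right, lower) box that caps the image height."""
    width, height = size
    return (0, 0, width, min(height, max_height))


def slice_image(path, max_height=800):
    """Crop the PNG at `path` in place. Assumes PIL or Pillow is installed."""
    from PIL import Image  # imported lazily so crop_box works without PIL
    image = Image.open(path)
    cropped = image.crop(crop_box(image.size, max_height))
    cropped.load()  # force the crop before the source file is overwritten
    cropped.save(path, "PNG")
```

For a 1024x3000 screenshot, `crop_box((1024, 3000))` gives `(0, 0, 1024, 800)`; an image shorter than the limit is left at its original height.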

If Japanese text in the screenshots is garbled, install Japanese fonts

# yum install fonts-ja* ttfonts-ja*

Now I can collect a large number of dog images ♪
