[PYTHON] Try various things with PhantomJS

What is PhantomJS?

Homepage: PhantomJS. For reference: Why is "WebKit" between Google and Apple so important?

In a nutshell, it is a WebKit-based headless browser. WebKit is a rendering engine mainly used in web browsers. PhantomJS uses JavaScriptCore, the JavaScript engine built into WebKit (Safari uses it too). It can also be used for scraping and screen capture.

How did I come across PhantomJS?

When I was trying out scraping with Python 3.4, I ran into a site that renders its content with JavaScript. I decided to follow the article "Quick web scraping with Python (while supporting JavaScript loading)" (http://qiita.com/beatinaniwa/items/72b777e23ef2390e13f8#comment-8ec96aa8e93ceb9cf3ea).

Installation

So I went ahead with the installation.

$ brew install phantomjs

Experiment #0

I will try various things by following the Quick Start guide published online. Basically, we will write JavaScript.

Experiment #1 Hello, World!

In the terminal, move to the directory you want to use for the project (or create one with mkdir). Create a file with touch hello.js in the project directory, then open it for editing with open hello.js.

hello.js


console.log('Hello, World!');
phantom.exit();

When I run phantomjs hello.js in the terminal, it prints Hello, World!. The first line, console.log, writes the string Hello, World! to the terminal. The second line, phantom.exit(), terminates execution. Note that without the phantom.exit() call, PhantomJS keeps running and never returns to the shell.

Experiment #2 Page Loading

With PhantomJS you can create a page object for any web page and then load, analyze, and render it.

page_loading.js


//Make a headless browser
var page = require('webpage').create();

//Open the specified URL
page.open('https://google.com', function(status) {
  console.log("Status: " + status);
  if(status === "success") {
    // Screen capture
    page.render('google.png');
  }
  phantom.exit();
});

If successful, you should see Status: success, and a screen capture of Google (google.png) should appear in your working directory. You can also measure how long a page takes to load. For example, to find out how fast http://www.google.com loads:

loadspeed.js


var page = require('webpage').create(),
  system = require('system'),
  t, address;

// Print usage if no URL was given on the command line
if (system.args.length === 1) {
  console.log('Usage: loadspeed.js <http://www.google.com>');
  phantom.exit();
}

t = Date.now();
address = system.args[1];
page.open(address, function(status) {
  if (status !== 'success') {
    console.log('FAIL to load the address');
  } else {
    t = Date.now() - t;
    console.log('Loading ' + system.args[1]);
    console.log('Loading time ' + t + ' msec');
  }
  phantom.exit();
});

If you run it as phantomjs loadspeed.js http://www.google.com, output such as the following is displayed:

Loading http://www.google.com
Loading time 698 msec

Experiment #3 Code Evaluation

You can retrieve values from the JavaScript context of a web page using the evaluate() function. For example, to get the title of a web page:

//Create a headless browser
var page = require('webpage').create();

//Open URL
page.open('http://www.google.com', function(status) {
  //Get data via JS in the browser
  var title = page.evaluate(function() {
    return document.title;
  });
  console.log('Page title is ' + title);
  phantom.exit();
});

When I run it on the terminal, it says Page title is Google.
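
As a further sketch of evaluate(): the function you pass runs inside the page, so it cannot see variables from the PhantomJS script. Values go in as extra arguments to evaluate() and come back as the return value, and both should be simple, JSON-serializable data. The function name countElements, the 'a' selector, and the typeof phantom guard (which only lets the file load outside PhantomJS as well) are assumptions made for illustration and are not part of the Quick Start.

```javascript
// Hypothetical helper: runs inside the page when passed to page.evaluate().
// It relies only on its argument and the page's DOM, not on script variables.
function countElements(selector) {
    return document.querySelectorAll(selector).length;
}

// Assumption: this guard just allows loading the file outside PhantomJS.
if (typeof phantom !== 'undefined') {
    var page = require('webpage').create();

    page.open('http://www.google.com', function (status) {
        if (status === 'success') {
            // 'a' is passed into the page as the selector argument.
            var n = page.evaluate(countElements, 'a');
            console.log('Number of links: ' + n);
        } else {
            console.log('FAIL to load the address');
        }
        phantom.exit();
    });
}
```

Checking status before calling evaluate() avoids reading the DOM of a page that never loaded.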

Supplemental tryout

You can even use PhantomJS to get driving directions from Google Maps.

direction.js


// Get driving direction using Google Directions API.

var page = require('webpage').create(),
    system = require('system'),
    origin, dest, steps;

if (system.args.length < 3) {
    console.log('Usage: direction.js origin destination');
    console.log('Example: direction.js "San Diego" "Palo Alto"');
    phantom.exit(1);
} else {
    origin = system.args[1];
    dest = system.args[2];
    page.open(encodeURI('http://maps.googleapis.com/maps/api/directions/xml?origin=' + origin +
                '&destination=' + dest + '&units=imperial&mode=driving&sensor=false'), function (status) {
        if (status !== 'success') {
            console.log('Unable to access network');
        } else {
            steps = page.content.match(/<html_instructions>(.*)<\/html_instructions>/ig);
            if (steps == null) {
                console.log('No data available for ' + origin + ' to ' + dest);
            } else {
                steps.forEach(function (ins) {
                    ins = ins.replace(/\&lt;/ig, '<').replace(/\&gt;/ig, '>');
                    ins = ins.replace(/\<div/ig, '\n<div');
                    ins = ins.replace(/<.*?>/g, '');
                    console.log(ins);
                });
                console.log('');
                console.log(page.content.match(/<copyrights>.*<\/copyrights>/ig).join('').replace(/<.*?>/g, ''));
            }
        }
        phantom.exit();
    });
}

Save it to an appropriate directory and run it in the terminal as phantomjs direction.js <origin> <destination>. When I ran phantomjs direction.js Tokyo Osaka as a trial, I got:

Head south
Turn right at Tokyo Metropolitan Government South (intersection) toward Metropolitan Road 431
At Tsunohazu Citizens Center (intersection),continue onto Metropolitan Road 431
Turn left at Nishi-Shinjuku 4-chome (intersection) onto Yamate-dori/Route 317
Continue straight to stay on Yamate Dori/Route 317
Take the ramp on the right to Metropolitan Expressway Central Circular Route
Toll road
Continue onto Exit Hatsudai Minami Tollhouse
Toll road
Merge onto Metropolitan Expressway Central Circular Route
Toll road
Take exit Ohashi JCT on the right toward Tomei Expressway Inner Circular
Toll road
Keep left at the fork,follow signs for Tomei and merge onto Metropolitan Expressway No. 3 Shibuya Line
Toll road
Keep right to continue on Tomei Expressway
Toll road
Keep right at the fork to stay on Tomei Expressway,follow signs for right route, Shizuoka, Gotemba
Toll road
Take exit Gotemba JCT toward Shin Tomei / Shizuoka / Nagoya
Toll road
Continue onto Shin Tomei Expressway
Toll road
Continue onto Exit Hamamatsu Inasa JCT
Toll road
Keep right at the fork,follow signs for Tomei / Tokyo / Nagoya and merge onto Shin Tomei Expressway
Toll road
Take exit Mikkabi JCT on the right toward Tomei / Nagoya
Toll road
Merge onto Tomei Expressway
Toll road
Take exit Toyota JCT toward Tokai Kanjo Expressway, Isewangan Expressway, Toyota East Exit, Toki JCT, Yokkaichi, Shin-Meishin Expressway
Toll road
Keep right at the fork,follow signs for Isewangan Expressway, Yokkaichi, Shin-Meishin Expressway and merge onto Isewangan Expressway
Toll road
Take exit Tobishima IC on the right toward Isewangan Expressway
Toll road
Continue onto Isewangan Expressway
Toll road
Take exit Yokkaichi JCT on the right toward Higashi-Meihan Road / Osaka / Ise Road
Toll road
Merge onto Higashi-Meihan Expressway
Toll road
Take exit Kameyama JCT toward Shin-Meishin Expressway, Kyoto, Osaka
Toll road
Continue onto Shin-Meishin Expressway
Toll road
Take exit Kusatsu JCT toward Kusatsu PA / Meishin / Keiji / Kyoto / Osaka
Toll road
Keep right at the fork to continue on Exit Kusatsu PA,follow signs for Meishin and merge onto Meishin Expressway
Toll road
Keep right at the fork to stay on Meishin Expressway,follow signs for right route
Toll road
Take exit Toyonaka IC toward Hanshin Expressway / Toyonaka Exit / Osaka City
Toll road
Keep left at the fork,follow signs for Osaka City / General Road Exit / Hanshin Expressway
Toll road
Keep right at the fork to continue on Exit High-speed Toyonaka IC,follow signs for Hanshin Expressway
Toll road
Continue onto Exit Meishin Hanshin Tollhouse
Toll road
Merge onto Hanshin Expressway No. 11 Ikeda Line
Toll road
Merge onto Hanshin Expressway No. 1 Loop Line
Toll road
Take exit Toshitaka Kitahama toward Kitahama exit
Partial toll road
Turn right at Sugaharacho Nishi (intersection) onto Sakaisuji
Slight right to stay on Sakaisuji
Slight right onto Tenjinbashi
Turn right at Tenjinbashi (intersection) onto Tosabori-dori/Prefectural Road 168
Turn right at Kitahama 2 (intersection) toward Nakanoshima-dori
Turn is not allowed 8:00 AM – 8:00 PM
Turn left toward Nakanoshima-dori
Turn left onto Nakanoshima-dori
Destination will be on the left

Accuracy aside, it seems to work for now. It turns out that you can try quite a lot with PhantomJS alone. Reference: direction.js
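
To see what the chain of replace() calls in direction.js does to each step, the same regular expressions can be pulled out into small functions and run on a tiny input. This is only a sketch: the helper names cleanInstruction and extractSteps and the sample string are made up for illustration, and the sample is not real API output.

```javascript
// Hypothetical helper (not in the original direction.js): turn one
// <html_instructions> match into plain text with the same replacements.
function cleanInstruction(ins) {
    // Decode escaped angle brackets so the inner HTML becomes real tags.
    ins = ins.replace(/&lt;/ig, '<').replace(/&gt;/ig, '>');
    // Put each <div> (notes such as "Toll road") on its own line.
    ins = ins.replace(/<div/ig, '\n<div');
    // Drop every remaining tag, including the outer <html_instructions>.
    return ins.replace(/<.*?>/g, '');
}

// Hypothetical helper: extract every step from a response body.
function extractSteps(xml) {
    var matches = xml.match(/<html_instructions>(.*)<\/html_instructions>/ig);
    return matches === null ? [] : matches.map(cleanInstruction);
}

// Made-up sample input for illustration only.
var sample =
    '<step>\n' +
    '<html_instructions>Head &lt;b&gt;south&lt;/b&gt;</html_instructions>\n' +
    '</step>\n' +
    '<step>\n' +
    '<html_instructions>Merge onto Route 1&lt;div style="font-size:0.9em"&gt;' +
    'Toll road&lt;/div&gt;</html_instructions>\n' +
    '</step>';

extractSteps(sample).forEach(function (step) {
    console.log(step);
});
```

The second sample step comes out as two lines, which is why notes such as Toll road appear on their own lines in the output above.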
