[PYTHON] Use Ghost.py as an alternative to PhantomJS

deck2pdf

This year I made a command-line tool called deck2pdf that captures HTML slides and converts them to PDF.

Basically, it follows the approach that comes up most often when you google the problem: capture each page as a PNG, then join all the images into one PDF. That is how I wrote the version 0 series.

Since I wanted to keep the tool within Python packages as much as possible, I went looking for an alternative and found Ghost.py.

Get to know Ghost.py

Ghost.py is a WebKit client written in Python that provides sessions, JavaScript evaluation (evaluate), screen captures and more. I haven't compared it in detail with PhantomJS and similar tools, but it handled the minimal behavior I needed without trouble (although I did get stuck on one thing), so I adopted it as-is in my own package.

Try using Ghost.py

Try to install

Ghost.py uses Qt, so you need PySide or PyQt. This time I tried using PySide.

$ brew install qt
$ pip install PySide==1.2.2
$ pyside_postinstall.py -install
$ pip install Ghost.py

Supplement

The latest version of PySide is actually 1.2.4 at this point, but 1.2.4 has no Mac wheel, so pip appears to build it from source and installation takes a long time. If you don't have PySide installed yet and just want to try things out, I think pinning 1.2.2 as shown above is faster.

Try using

First, start Ghost


>>> from ghost import Ghost
>>> ghost = Ghost()
>>> session = ghost.start()

The client-side process starts when you create a Ghost instance. The start() method then creates a session instance.

Visit the HTML5 slides demo page


>>> resp = session.open('http://html5slides.googlecode.com/svn/trunk/template/index.html')
2015-12-20T18:02:12.662Z [WARNING ] QT: libpng warning: iCCP: known incorrect sRGB profile
2015-12-20T18:02:12.746Z [WARNING ] QT: libpng warning: iCCP: known incorrect sRGB profile
>>> type(resp)
<type 'tuple'>
>>> len(resp)
2
>>> resp[0]
<ghost.ghost.HttpResource object at 0x10a99f510>
>>> resp[1]
[<ghost.ghost.HttpResource object at 0x10a99f510>, <ghost.ghost.HttpResource object at 0x10a99f410>, <ghost.ghost.HttpResource object at 0x10a99f610>, <ghost.ghost.HttpResource object at 0x10a99f750>, <ghost.ghost.HttpResource object at 0x10a99f8d0>, <ghost.ghost.HttpResource object at 0x10a99f910>, <ghost.ghost.HttpResource object at 0x10a99fb10>, <ghost.ghost.HttpResource object at 0x10a99fc10>, <ghost.ghost.HttpResource object at 0x10a99fa10>, <ghost.ghost.HttpResource object at 0x10a99fd10>]
>>>
>>> resp[0].url
u'http://html5slides.googlecode.com/svn/trunk/template/index.html'
>>> resp[1][0].url
u'http://html5slides.googlecode.com/svn/trunk/template/index.html'
>>> resp[1][1].url
u'http://html5slides.googlecode.com/svn/trunk/slides.js'
>>> resp[1][2].url
u'http://fonts.googleapis.com/css?family=Open+Sans:regular,semibold,italic,italicsemibold|Droid+Sans+Mono'

This is hard to see in an interactive shell, but session.open(url) requests and fetches the specified URL together with every resource referenced by its content.
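To make the return value easier to read outside the shell, here is a minimal sketch that unpacks the tuple and lists the URL of each fetched resource. It relies only on the (page, resources) shape and the .url attribute seen in the session above. list_resource_urls is a hypothetical helper name, not part of Ghost.py, and the None check is a defensive assumption for the case where no page was loaded.

```python
def list_resource_urls(response):
    """Return (page_url, resource_urls) from the tuple given by session.open().

    Assumes the (page, resources) shape shown in the session above, where
    every item exposes a `.url` attribute. Illustrative helper only.
    """
    page, resources = response
    # Defensive assumption: treat a missing page as "no URL".
    page_url = page.url if page is not None else None
    return page_url, [resource.url for resource in resources]


# Intended use with a live session (not executed here):
#   page_url, urls = list_resource_urls(session.open(url))
```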

>>> session.capture_to('capture_1.png')

You can take a screenshot with the capture_to method. However...

capture_1.png

If you don't specify the capture region properly, the result is a mess. Alternatively, since the viewport size can be set, it may be better to fix the size in advance.

>>> session.capture_to('capture_2.png', region=(1940, 0, 3000, 740))
>>>

capture_2.png
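The region tuple is easier to work with when it is derived from a position and a size. The sketch below assumes that capture_to reads region as two corner points (x1, y1, x2, y2), not as (x, y, width, height); the values used above are consistent with that reading, but treat it as an assumption. region_from_box is a hypothetical helper, not a Ghost.py function.

```python
def region_from_box(left, top, width, height):
    """Build an (x1, y1, x2, y2) tuple from a top-left corner and a size.

    Assumption: the `region` argument of capture_to() is a pair of corner
    points. This helper is illustrative and not part of Ghost.py.
    """
    return (left, top, left + width, top + height)


# A 1060x740 box whose left edge sits at x=1940, matching the capture above.
region = region_from_box(1940, 0, 1060, 740)
```

The result can be passed straight to session.capture_to('capture_2.png', region=region).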

Where I got a little stuck

Ghost.py can run JavaScript directly in the session.

Move html5slides slides to the next page


>>> session.evaluate('nextSlide()')
(None, [])
>>> session.capture_to('capture_3.png', region=(1940, 0, 3000, 740))

capture_3.png

Even though I called the function that advances the slide and then took a capture, the slide did not advance. (If you run nextSlide in an ordinary browser such as Chrome, the slide advances without problems.)

From what I found while making deck2pdf, a Ghost.py session seems to leave the passage of time to the code outside Ghost.py. So if you do nothing, JavaScript called through evaluate is not actually executed until time is allowed to advance.


>>> session.sleep(1)
>>> session.capture_to('capture_4.png', region=(1940, 0, 3000, 740))

Here is the result after sleeping for 1 second. This time the slide content was captured without any problems.

capture_4.png
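Putting the steps together, capturing a whole deck could look roughly like the sketch below: capture, advance with evaluate, sleep so the JavaScript actually runs, then capture again. This is not the actual deck2pdf code. capture_slides, its parameters and the file name pattern are hypothetical, and it uses only the evaluate, sleep and capture_to calls shown in the sessions above.

```python
def capture_slides(session, count, region, advance_js='nextSlide()',
                   wait_seconds=1, name_format='slide_{0:03d}.png'):
    """Capture `count` slides, advancing and waiting between captures.

    `session` is expected to be a Ghost.py session, or any object with the
    same evaluate / sleep / capture_to methods. Returns the file names.
    """
    paths = []
    for index in range(count):
        if index > 0:
            # Ask the page to advance, then let time pass so the
            # JavaScript is executed before the next capture.
            session.evaluate(advance_js)
            session.sleep(wait_seconds)
        path = name_format.format(index + 1)
        session.capture_to(path, region=region)
        paths.append(path)
    return paths
```

A fixed one-second sleep is the simplest choice; slides with longer transitions may need a larger wait_seconds.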

Summary

Although it has some quirks like this one, Ghost.py can do most of what a WebKit-based tool can do, so it looks like there is a lot to play with once you get used to it.

It seems there will be no shortage of ideas to try, such as a SpeakerDeck clone with high Python purity, or generating slides from an archive of HTML files.
