How to add page numbers to PDF files (in Python)

Table of Contents

  1. [Introduction](#orga18657e)
  2. [ReportLab PDF Library](#org7441e75)
  3. [Combine ReportLab + pdfrw](#org2258be2)
  4. [How to use](#orgefcf672)
  5. [How to customize](#orga183ab3)

Introduction

I scan the math problems my elementary-school daughter got wrong and collect them in a PDF. When I try to print only the pages we need, though, the page numbers are often off from what I expected, and I end up redoing the print many times. That is surprisingly analog for something I meant to digitize. So I wanted to add page numbers to the PDF file automatically. With a number on every page, you can either prevent page mix-ups caused by human error or at least notice them immediately.

However, when I looked for a way to number the pages of a PDF, I was surprised not to find a good one. I tried a few web services that turned up in my search, but some put the number in a position I didn't like, and others charge a fee (as a monthly subscription, no less) when the document has many pages.

So I looked into whether PDFs could be page numbered with Python, which I have recently learned to use and am enjoying. It turned out that no PDF library seems to cover this use case, at least not on its own.

This article was also posted on https://achiwa912.github.io/.

ReportLab PDF Library

The best-known PDF libraries for Python seem to be PyPDF2 and pdfrw. They are good at merging multiple PDF files, splitting one file into several, and reordering pages, but they do not seem to support the use case of "adding page numbers to an existing PDF file".

Upon further investigation, I found that the ReportLab library seemed to be able to number pages. https://www.blog.pythonlibrary.org/2013/08/12/reportlab-how-to-add-page-numbers/

The title of that page looked very promising, but the sample code did not fit my needs. It does not read an existing PDF file at all; it adds page numbers to PDF pages that it creates from scratch. That doesn't help here...

I also read through the manual. https://www.reportlab.com/docs/reportlab-userguide.pdf

Again, there was no explanation of how to read an existing PDF file.

Combine ReportLab + pdfrw

Still, I kept searching without giving up, and I found this. https://stackoverflow.com/questions/28281108/reportlab-how-to-add-a-footer-to-a-pdf-file

Stack Overflow delivers, as expected! I love it as much as Qiita in Japan. According to the answer, it should be possible by combining ReportLab and pdfrw. There is one worrying note, though:

DISCLAIMER: Tested on Linux using as input file a pdf file generated by Reportlab. It would probably not work in an arbitrary pdf file.

In other words, it was tested with a PDF file generated by ReportLab and would probably not work with an arbitrary PDF file.

... What!? But this is the only lead I have, so let's try it.

Here is the sample code from the Stack Overflow page, modified for my purpose.

from reportlab.pdfgen.canvas import Canvas
from pdfrw import PdfReader
from pdfrw.toreportlab import makerl
from pdfrw.buildxobj import pagexobj
import sys
import os

# Expect exactly one argument: the name of an existing PDF file.
if len(sys.argv) != 2 or not sys.argv[1].lower().endswith(".pdf"):
    print(f"Usage: python {sys.argv[0]} <pdf filename>")
    sys.exit(1)
input_file = sys.argv[1]
# Write the result next to the input, e.g. foo.pdf -> foo_pgn.pdf
output_file = os.path.splitext(input_file)[0] + "_pgn.pdf"

# Read every page with pdfrw and wrap it as a form XObject.
reader = PdfReader(input_file)
pages = [pagexobj(p) for p in reader.pages]

# The canvas uses ReportLab's default page size (A4).
canvas = Canvas(output_file)

for page_num, page in enumerate(pages, start=1):
    # Draw the original page, then the page number on top of it.
    canvas.doForm(makerl(canvas, page))

    footer_text = f"{page_num}/{len(pages)}"
    canvas.saveState()
    canvas.setFillColorRGB(0, 0, 0)  # text is painted with the fill color
    canvas.setFont('Times-Roman', 14)
    canvas.drawString(290, 10, footer_text)
    canvas.restoreState()
    canvas.showPage()

canvas.save()

And when I ran it... (screenshot: pdfpage.png) Huh, it worked right away. To be clear, the 7/88 at the bottom of the page is the page number added by this script. So what was that disclaimer about?

How to use

The script uses f-strings, so run it with Python 3.6 or later.

PDF library installation

pip install reportlab
pip install pdfrw

Save the above code as addpagenum.py. (You can change the file name to whatever you like.)

Run

python addpagenum.py <pdf_filename>

The output is written to a new file with _pgn added to the input file name. The page number is drawn at the bottom center of the page, assuming A4 paper.
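
As a minimal sketch of how the script turns the input name into the output name, the logic can be pulled out on its own. The helper name `output_path_for` is hypothetical and not part of the script above; it also checks the file extension rather than looking for ".pdf" anywhere in the name.

```python
import os


def output_path_for(input_path):
    """Return the input path with "_pgn" inserted before the .pdf extension.

    Hypothetical helper for illustration; not part of the original script.
    """
    root, ext = os.path.splitext(input_path)
    # Compare only the extension, case-insensitively.
    if ext.lower() != ".pdf":
        raise ValueError(f"expected a .pdf file, got: {input_path!r}")
    return root + "_pgn.pdf"


print(output_path_for("math_drills.pdf"))
```

Running `python addpagenum.py math_drills.pdf` would therefore leave the original file untouched and create math_drills_pgn.pdf beside it.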

How to customize

Adjust the following lines as needed.

footer_text = f"{page_num}/{len(pages)}"
canvas.setFont('Times-Roman', 14)
canvas.drawString(290, 10, footer_text)

- Change footer_text to change the displayed content.
- To move the page number, change x = 290 and y = 10 in the drawString call.
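
For example, the footer text could come from a small formatting function. This is only a sketch: the function name `format_footer` and the style names are made up for illustration, and its return value would be assigned to footer_text in the loop.

```python
def format_footer(page_num, total, style="fraction"):
    """Build the footer string for one page.

    Hypothetical helper; the style names are illustrative only.
    """
    if style == "fraction":
        # Same format as the script in this article, e.g. "7/88"
        return f"{page_num}/{total}"
    if style == "long":
        return f"Page {page_num} of {total}"
    if style == "plain":
        return str(page_num)
    raise ValueError(f"unknown style: {style!r}")


print(format_footer(7, 88))
print(format_footer(7, 88, style="long"))
```

In the script, `footer_text = format_footer(page_num, len(pages), style="long")` would replace the f-string line.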

On a ReportLab canvas, the coordinates (x = 0, y = 0) are at the bottom left of the page. If you want a page size other than A4, such as Letter, specify it when creating the canvas object. See the ReportLab manual for details.
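
Instead of hard-coding x = 290, the position can be computed from the page width. The sketch below is a variation, not the code I actually ran: it sizes each output page from the source page's BBox (a pattern used in pdfrw's own ReportLab examples) and centers the text with drawCentredString. It assumes each page box starts at (0, 0). The names `centered_footer_xy` and `add_page_numbers` are hypothetical.

```python
def centered_footer_xy(page_width, margin=10):
    """Return (x, y) for text centered horizontally, `margin` points from the bottom.

    ReportLab's origin is the bottom-left corner, so y is simply the margin.
    """
    return page_width / 2, margin


def add_page_numbers(input_file, output_file):
    """Copy input_file to output_file with a centered "n/total" footer on each page."""
    # Imported here so centered_footer_xy() can be tried without the PDF libraries.
    from pdfrw import PdfReader
    from pdfrw.buildxobj import pagexobj
    from pdfrw.toreportlab import makerl
    from reportlab.pdfgen.canvas import Canvas

    pages = [pagexobj(p) for p in PdfReader(input_file).pages]
    canvas = Canvas(output_file)
    for page_num, page in enumerate(pages, start=1):
        # Assumption: the page box starts at (0, 0), so BBox[2] and BBox[3]
        # are the page width and height in points.
        width, height = float(page.BBox[2]), float(page.BBox[3])
        canvas.setPageSize((width, height))
        canvas.doForm(makerl(canvas, page))

        x, y = centered_footer_xy(width)
        canvas.setFont("Times-Roman", 14)
        canvas.drawCentredString(x, y, f"{page_num}/{len(pages)}")
        canvas.showPage()
    canvas.save()


print(centered_footer_xy(612))  # Letter is 612 points wide
```

If every page should simply be Letter, passing `pagesize=letter` (from `reportlab.lib.pagesizes`) to `Canvas` is the simpler route.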
