[PYTHON] Image the pdf file and stamp all pages with confidential stamps (images).

This was happening at the company ...

――I want this pdf to be an image → Print it out and scan it (!?) --pdf I want to stamp all pages with confidential stamps → Print and stamp all pages with confidential rubber stamps (!! ??)

Take care of resources. Disassemble pdf & edit image & convert to pdf using python By the way, adobe acrobat also has a stamp function and a watermark function. It may be used for processing that cannot be realized with genuine functions.

Execution environment

Advance preparation

-Install External software Poppler (using poppler-0.67.0_x86) --Add the bin folder in the unzipped file to the environment variable

First folder structure

image.png

This part is executed from cmd

dir_path = 'C:\\Users\\user\\Desktop\\topdf'

import os
from pdf2image import convert_from_path, convert_from_bytes

images = convert_from_path(os.path.join(dir_path,'a.pdf'))

for i in range(0,len(images)):
    images[i].save(os.path.join(dir_path,'png/test'+str(i).zfill(8)+'.png'), 'png')

Execute from notebook below

import glob
import os

dir_path = 'C:\\Users\\user\\Desktop\\topdf'

png_data = glob.glob(os.path.join(dir_path,'png','*.png'))

Check if the image can be read properly Set the image you want to set as a stamp as im2

from PIL import Image, ImageDraw, ImageFilter

im1 = Image.open(png_data[0])
im2 = Image.open(os.path.join(dir_path,'stamp','stamp.png'))

Check the size of the stamp image and enlarge the image When pasting on the image with paste, adjust the coordinates as well

im2.size

back_im = im1.copy()
back_im.paste(im2.resize((216, 72)), (10, 5))

When the adjustment is finished, paste it in multiple batches I'm also worried about the file size, so set it to jpg

for o in range(0,len(png_data)):
    im1 = Image.open(png_data[o])
    im2 = Image.open(os.path.join(dir_path,'stamp','stamp.png'))
    back_im = im1.copy()
    back_im.paste(im2.resize((216, 72)), (10, 5))
    back_im.save(os.path.join(dir_path,'png_stamped',str(o).zfill(8)+'.jpg'), 
        quality=10)

Convert to pdf using img2pdf

import img2pdf

fname =os.path.join(dir_path,'pdf',"output.pdf") 
dirname = os.path.join(dir_path,'png_stamped')

with open(fname,"wb") as f:
    imgs = []
    for fname in os.listdir(dirname):
        if not fname.endswith(".jpg "):
            continue
        path = os.path.join(dirname, fname)
        if os.path.isdir(path):
            continue
        imgs.append(path)
    f.write(img2pdf.convert(imgs))

A pdf file is created in the pdf folder

that's all

No, buy acrobat.

reference

img2pdf PyPl

How to convert pdf to image with python

Paste another image on an image with Pillow

Recommended Posts

Image the pdf file and stamp all pages with confidential stamps (images).
I ran GhostScript with python, split the PDF into pages, and converted it to a JPEG image.
Extract images and tables from pdf with python to reduce the burden of reporting
Convert the image in .zip to PDF with Python
Story of image analysis of PDF file and data extraction
POST the image with json and receive it with flask
[Python] Read the csv file and display the figure with matplotlib
Convert garbled scanned images to PDF with Pillow and PyPDF
Read the VTK file and display the color map with jupyter.