[PYTHON] Split PDF into arbitrary pages

Introduction

Every so often a huge PDF shows up, and on the boss's orders my colleagues open it in Chrome and save it 2 pages at a time (each 2-page set is one day's worth of statistical data). After watching that tearful daily effort, I thought: let's automate it with Python and go to lunch. So I made it with Python.

PyPDF2

There seem to be various modules for working with PDFs, but this one looked easy, so I used a module called "PyPDF2".

pip install PyPDF2

Type that and press Enter. From reading around, there seems to be a way to split a PDF into single pages. To cut it into sets of several pages instead, there are two options:

  1. Disassemble everything once and reorganize into 2 pages each
  2. Extract pages 1-2, 3-4, etc.
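The two options above can be compared on a plain list of page numbers before touching any PDF. This is only a sketch: the list stands in for the pages, and the function names `regroup_single_pages` and `slice_ranges` are made up for illustration and are not part of PyPDF2.

```python
def regroup_single_pages(pages, size):
    """Option 1: break everything into single pages, then rebuild sets of `size`."""
    singles = [[p] for p in pages]  # one single-page "file" per page
    groups = []
    for i in range(0, len(singles), size):
        merged = []
        for single in singles[i:i + size]:
            merged.extend(single)  # glue the single pages back together
        groups.append(merged)
    return groups


def slice_ranges(pages, size):
    """Option 2: take consecutive ranges (pages 1-2, 3-4, ...) directly."""
    return [pages[i:i + size] for i in range(0, len(pages), size)]


pages = list(range(1, 7))  # a pretend 6-page PDF
print(regroup_single_pages(pages, 2))
print(slice_ranges(pages, 2))
```

Both give the same sets, but option 2 gets there in one step, which is why it looks like less trouble.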

Option 1 looked like a hassle, so I tried option 2. The script splits "test.pdf" into 2-page pieces and saves each one as a ".pdf" file named after its (0-based) starting page number.

pdf_separate.py


import PyPDF2

f = 'test.pdf'  # PDF you want to split
page_sep = 2  # Number of pages per output file

# Get the total number of pages in the PDF
# (PdfFileReader, getNumPages and PdfFileMerger are the pre-3.0 PyPDF2 names)
reader = PyPDF2.PdfFileReader(f)
page_num = reader.getNumPages()

# Step through the PDF page_sep pages at a time; the 0-based start page
# is also used as the output file name
for page in range(0, page_num, page_sep):
    merger = PyPDF2.PdfFileMerger()
    start = page
    end = start + page_sep  # end of the range (exclusive)
    merger.append(f, pages=(start, end))
    file_name = str(start) + '.pdf'
    merger.write(file_name)
    merger.close()  # release the merger's file handles

print('the end')

By the way

page_sep = 2  # Number of pages per output file

If you change "2" to "3", the PDF is split into 3-page files. But if the total number of pages is not a multiple of 3, the last file comes up short ("hey, there aren't enough pages for this one") and I get an error when generating it.

I could do something about that, but I don't need it right now, so I'm heading off to eat.
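For anyone who does need it, one possible fix is to clamp the end of each range to the total page count so the last file simply comes out shorter. The sketch below shows only the range arithmetic. `chunk_ranges` is a hypothetical helper name, and it assumes the same 0-based, end-exclusive (start, end) convention the script uses.

```python
def chunk_ranges(page_num, page_sep):
    """Return (start, end) pairs covering page_num pages, end exclusive.

    The last pair is clamped to page_num, so it may span fewer than
    page_sep pages instead of running past the end of the PDF.
    """
    if page_sep < 1:
        raise ValueError("page_sep must be at least 1")
    return [
        (start, min(start + page_sep, page_num))
        for start in range(0, page_num, page_sep)
    ]


# A 7-page PDF split every 3 pages: the last range holds the single leftover page
print(chunk_ranges(7, 3))  # [(0, 3), (3, 6), (6, 7)]
```

In the script, the same idea would amount to writing end = min(start + page_sep, page_num) before calling merger.append.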
