Simplify PDF password unlock with python + bat

environment

Windows 10 (I'm not familiar with Mac / Linux, so I can't support it) python 3.7

table of contents

--Motivation and overview --Directory structure and file contents --Dependent packages --Commentary --Actual usage - Tips

Motivation and overview

Due to the influence of the coronavirus, I think that many online classes are being held at universities. I think that most of the lesson materials for that purpose are distributed using the services on campus, but there are teachers who can sometimes put a password on them. It's not good to redistribute it, and you probably want to reject browsing by non-students, but for this, it's quite annoying to enter the password every time you open ***. *** *** Therefore, let's create a PDF with the password unlocked. If you search for "PDF password unlock" etc., you will find various things, but this time, you can unlock the PDF just by dragging and dropping ***. That's why we go through a two-step process.

--Link bat file and python file --Unlock the PDF password (if known)

It can be realized with python alone, but I think that bat is necessary to unlock the password just by dragging and dropping. So let's use python to unlock the PDF password.

Directory structure and file contents

First, the directory structure and the implemented file contents are shown. Interpret the class folder as the folder where the lesson materials are placed.

root/  ├ main.py  └ decoder.bat

class/  ├ password.txt  └ target.pdf

main.py


import sys, pathlib

from pikepdf import Pdf


def get_password(folder):
    passfile = None

    found : bool = False
    #First search in the same folder
    for f in folder.glob("*"):
        if "password" == f.stem:
            passfile = f
            found = True

    #If they are not on the same level, search for another level above
    if not found:
        folder = folder.parent
        for f in folder.glob("*"):
            if "password" == f.stem:
                passfile = f

    #If there is no password file in the next higher hierarchy
    if not found:
        return None
    
    #Extract password from file
    with open(passfile, mode="r", encoding="utf-8") as f:
        password = f.read()

    return password

def main():
    #When starting from bat
    try:
        path = sys.argv[1]
    #When running a py file alone
    except:
        print("input target path")
        path = input()

    path = pathlib.Path(path)
    #Reject other than PDF files
    if path.suffix != ".pdf":
        print("Only PDF file accept.")
        sys.exit()
    
    password = get_password(folder=path.parent)
    #If the password file does not exist
    if password == None:
        print("No password file. input password = ", end="")
        password = input()
    try:
        pdf = Pdf.open(path, password=password)
    except:     #If the password is incorrect
        print("failed to open PDF. check password.")
        sys.exit()

    pdf_unlock = Pdf.new()
    pdf_unlock.pages.extend(pdf.pages)
    #Save at the same level as the original PDF
    newname = f"{path.parent / path.stem}-unlocked.pdf"
    pdf_unlock.save(newname)


if __name__ == '__main__':
    main()

decoder.bat


@echo off
Rewrite the rem class part with the full path of the class folder
cd "class"
"main.py" %*

Dependent packages

You can pip. Easy.

pip install pikepdf

Use the library pikepdf to unlock the PDF password. Existed by reference [1]. A sample program is also available here, so please refer to it.

Commentary

About cooperation between bat file and python

If you drag and drop a file such as an image file to the bat file to open it, at the command prompt

C:\Users\user> decoder.bat (Full path of the file)

It looks like. This argument can be received as% * in the bat file. (Actually,% 0 is better ...) (Refer to reference [2] for details)

Pass this argument as is to python. In python you can accept command line arguments as sys. Reference [3] describes the minimum. So, sys.argv [1] contains the argument (file path) passed from the bat file this time.

Remove PDF password

The basic steps are as follows. All variable names are the same as main.py

Open PDF file with password
pdf = Pdf.open(path, password=password)
Prepare a new PDF file
pdf_unlock = Pdf.new()
Copy the pre-opened PDF to a new PDF
pdf_unlock.pages.extend(pdf.pages)
Save new PDF
pdf_unlock.save(newname)
The newly saved PDF does not have a password, so it means that you have achieved your goal.

Actually use

1. Prepare the password.
Write the password of target.pdf in password.txt in the directory structure diagram. At this time, as a caveat, do not start a new line. Since the passwords given by most teachers are the same every time, we adopted a mechanism to prepare a password file.
You can type in the command prompt without preparing it, so if the password is different each time, type it at runtime without creating password.txt.

2. Create shortcut
It's a good idea to put a shortcut for decoder.bat on your desktop. It seems more convenient to be able to execute it from the desktop. It is not required separately.

3. Execute
Drag and drop the PDF file onto the shortcut created in step 2. Then the command prompt will be launched (for a moment) and the password-unlocked PDF file will be saved in the same location as the original file.
If you have not created a password file, you will be prompted for it at the command prompt. Please enter.
~~ In addition, the line break prohibition of password.txt can be solved by implementing so that only the first line is read on the python side ~~

Tips

Operation of python file alone

It works with main.py alone. In that case, you will be prompted for the (absolute) path where the PDF exists. At this time, you can also drag and drop the PDF file to the command prompt to work.

About password.txt

If you prepare two files as shown below, the one closest to the PDF file will be given priority.

class/  ├ password.txt  └ 0605/    ├ password.txt    └ target.pdf

If you are a teacher who changes your password depending on the day, you may want to create password.txt in the folder for each day.

Notes

--Unlock password Never redistribute PDF files to teachers without permission --Don't give your password to others.

Postscript

It doesn't seem to work if the password contains double-byte characters.

Summary

――Python is convenient! !! !! ――I'm glad if you like it

References

[1] [Python] Let's unlock the password for the pdf file! [2] Pass arguments when executing batch file [3] Python: What are command line arguments?

Recommended Posts

Simplify PDF password unlock with python + bat
Integrate PDF files with Python
Password management with python: keyring
[Python] Generate a password with Slackbot
Password generation in texto with python
FizzBuzz with Python3
Scraping with Python
Library comparison summary to generate PDF with Python
Convert PDF to image (JPEG / PNG) with Python
Statistics with python
Scraping with Python
Python with Go
[Automation] Extract the table in PDF with Python
Twilio with Python
Integrate with Python
Play with 2016-Python
AES256 with python
Tested with Python
python starts with ()
Experiment with NIST 800-63B password rules in Python
with syntax (Python)
Bingo with python
Zundokokiyoshi with python
Password the PDF
Excel with Python
Microcomputer with Python
Cast with python
Convert the image in .zip to PDF with Python
I tried to automatically generate a password with Python3
Try to automate pdf format report creation with Python
Serial communication with Python
Zip, unzip with python
Django 1.11 started with Python3.6
Primality test with Python
Socket communication with Python
Data analysis with python 2
Try scraping with Python.
Learning Python with ChemTHEATER 03
Sequential search with Python
"Object-oriented" learning with python
Run Python with VBA
Handling yaml with python
Solve AtCoder 167 with python
Serial communication with python
Output PDF with Django
[Python] Use JSON with Python
Learning Python with ChemTHEATER 05-1
Run prepDE.py with python3
1.1 Getting Started with Python
Collecting tweets with Python
Binarization with OpenCV / Python
Rasterize PDF in Python
Kernel Method with Python
Non-blocking with Python + uWSGI
Posting tweets with python
Drive WebDriver with python
Use mecab with Python3
[Python] Redirect with CGIHTTPServer
Voice analysis with python
Think yaml with python
Operate Kinesis with Python