[PYTHON] I made a CLI tool to convert images in each directory to PDF

Github

:octocat: https://github.com/ikota3/image_utilities

Overview

In order to make self-catering comfortable, we have created a tool to convert the images stored in each directory to PDF. Also, subdirectories are created recursively.

The operation was confirmed on Windows 10.

Things necessary

How to use

$ git clone https://github.com/ikota3/image_utilities
$ cd image_utilities
$ pip install -r requirements.txt
$ python src/images_to_pdf.py convert -i "path/to/input" -o "path/to/output" -e "jpg,jpeg,png"

Implementation details

This time, I thought it would be difficult to make everything from scratch to make a CLI tool, so I used the library fire that was talked about before. ..

In fact, it was very easy to use.

Allows you to hit commands

First, create a skeleton so that you can hit the command and receive the input value. This time, create a class and call it with fire. In addition to classes, fire can call functions, modules, objects, and much more. See the official documentation for details. https://github.com/google/python-fire/blob/master/docs/guide.md

images_to_pdf.py


import fire

class PDFConverter(object):
    """Class for convert images to pdf."""

    def __init__(
            self,
            input_dir: str = "",
            output_dir: str = "",
            extensions: Union[str, Tuple[str]] = None,
            force_write: bool = False,
            yes: bool = False
    ):
        """Initialize

        Args:
            input_dir (str): Input directory. Defaults to "".
            output_dir (str): Output directory. Defaults to "".
            extensions (Union[str, Tuple[str]]): Extensions. Defaults to None.
            force_write (bool): Flag for overwrite the converted pdf. Defaults to False.
            yes (bool): Flag for asking to execute or not. Defaults to False.
        """
        self.input_dir: str = input_dir
        self.output_dir: str = output_dir
        if not extensions:
            extensions = ('jpg', 'png')
        self.extensions: Tuple[str] = extensions
        self.force_write: bool = force_write
        self.yes: bool = yes

	def convert(self):
		print("Hello World!")


if __name__ == '__main__':
    fire.Fire(PDFConverter)

For the time being, the skeleton is completed. If you type the command in this state, it should output Hello World!.

$ python src/images_to_pdf.py convert
Hello World!

Also, other parameters such as ʻinput_dir =" "` have default values, but if you do not pass a value on the command side without setting this, an error on the fire side will occur.

To pass the value, just add a hyphen before the prefix of the argument set in __init__ and then write the value you want to pass.

The method of passing the following commands is the same, although there are differences in the writing method.

$ # self.input_dir example
$ python src/images_to_pdf.py convert -i "path/to/input"
$ python src/images_to_pdf.py convert -i="path/to/input"
$ python src/images_to_pdf.py convert --input_dir "path/to/input"
$ python src/images_to_pdf.py convert --input_dir="path/to/input"

Also, I was confused when I tried to pass the list as a stumbling block.

$ # self.Examples of extensions
$ python src/images_to_pdf.py convert -e jpg,png # OK
$ python src/images_to_pdf.py convert -e "jpg,png" # OK
$ python src/images_to_pdf.py convert -e "jpg, png" # OK
$ python src/images_to_pdf.py convert -e jpg, png # NG

Input value check

Before performing PDF conversion processing, type check by ʻis instance ()` and check whether the specified path exists, etc. are performed.

images_to_pdf.py


def _input_is_valid(self) -> bool:
    """Validator for input.

    Returns:
        bool: True if is valid, False otherwise.
    """
    is_valid = True

    # Check input_dir
    if not isinstance(self.input_dir, str) or \
            not os.path.isdir(self.input_dir):
        print('[ERROR] You must type a valid directory for input directory.')
        is_valid = False

    # Check output_dir
    if not isinstance(self.output_dir, str) or \
            not os.path.isdir(self.output_dir):
        print('[ERROR] You must type a valid directory for output directory.')
        is_valid = False

    # Check extensions
    if not isinstance(self.extensions, tuple) and \
            not isinstance(self.extensions, str):
        print('[ERROR] You must type at least one extension.')
        is_valid = False

    # Check force_write
    if not isinstance(self.force_write, bool):
        print('[ERROR] You must just type -f flag. No need to type a parameter.')
        is_valid = False

    # Check yes
    if not isinstance(self.yes, bool):
        print('[ERROR] You must just type -y flag. No need to type a parameter.')
        is_valid = False

    return is_valid

Scan the directory, collect images, and convert to PDF

I am using something called ʻos.walk () to scan the directory from the received ʻinput_dir path. https://docs.python.org/ja/3/library/os.html?highlight=os walk#os.walk

The directory is scanned as shown below, images are collected, and PDF is created.

images_to_pdf.py


def convert(self):
    #To the prefix of the extension.Add
    extensions: Union[str | Tuple[str]] = None
    if isinstance(self.extensions, tuple):
        extensions = []
        for extension in self.extensions:
            extensions.append(f'.{extension}')
        extensions = tuple(extensions)
    elif isinstance(self.extensions, str):
        extensions = tuple([f'.{self.extensions}'])

	#Scan directories and convert images in each directory to PDF
    for current_dir, dirs, files in os.walk(self.input_dir):
        print(f'[INFO] Watching {current_dir}.')

        #A list that stores the path where the target image is located
        images = []

        #files is current_List of files in dir
        #Sorting will be out of order if the number of digits is different(https://github.com/ikota3/image_utilities#note)
        #Therefore, I prepared a function to make it as expected.(See below)
        for filename in sorted(files, key=natural_keys):
            if filename.endswith(extensions):
                path = os.path.join(current_dir, filename)
                images.append(path)

		#When there is no image as a result of scanning
        if not images:
            print(
                f'[INFO] There are no {", ".join(self.extensions).upper()} files at {current_dir}.'
            )
            continue

        pdf_filename = os.path.join(
            self.output_dir, f'{os.path.basename(current_dir)}.pdf'
        )

		# -If there is an f parameter, forcibly overwrite even if there is a file
        if self.force_write:
            with open(pdf_filename, 'wb') as f:
                f.write(img2pdf.convert(images))
            print(f'[INFO] Created {pdf_filename}!')
        else:
            if os.path.exists(pdf_filename):
                print(f'[ERROR] {pdf_filename} already exist!')
                continue

            with open(pdf_filename, 'wb') as f:
                f.write(img2pdf.convert(images))
            print(f'[INFO] Created {pdf_filename}!')

When collecting the images in the directory, there were times when the sort was wrong and the order was not what I expected, so I prepared a function. Also, I made it by referring to the following link. https://stackoverflow.com/questions/5967500/how-to-correctly-sort-a-string-with-a-number-inside

sort_key.py


import re
from typing import Union, List


def atoi(text: str) -> Union[int, str]:
    """Convert ascii to integer.

    Args:
        text (str): string.

    Returns:
        Union[int, str]: integer if number, string otherwise.
    """
    return int(text) if text.isdigit() else text


def natural_keys(text: str) -> Union[List[int], List[str]]:
    """Key for natural sorting

    Args:
        text (str): string

    Returns:
        Union[List[int], List[str]]: A list of mixed integer and strings.
    """
    return [atoi(c) for c in re.split(r'(\d+)', text)]

the end

I used to make CLI tools without using a library, but it was easier to implement than fire!

Would you like to make a CLI tool?

Recommended Posts

I made a CLI tool to convert images in each directory to PDF
I want to convert a table converted to PDF in Python back to CSV
I made a tool to compile Hy natively
I made a network to convert black and white images to color images (pix2pix)
I made a script in python to convert .md files to Scrapbox format
Batch convert PSD files in directory to PDF
I made a tool to convert Jupyter py to ipynb with VS Code
I made a tool to get new articles
I made a module in C language to filter images loaded by Python
I made a script to put a snippet in README.md
I made a code to convert illustration2vec to keras model
I wrote a CLI tool in Go language to view Qiita's tag feed with CLI
I made a tool to create a word cloud from wikipedia
[Titan Craft] I made a tool to summon a giant to Minecraft
I made a script in Python to convert a text file for JSON (for vscode user snippet)
Convert markdown to PDF in Python
Convert A4 PDF to A3 every 2 pages
I made a program to collect images in tweets that I liked on twitter with Python
A tool to convert Juniper config
I made a command to display a colorful calendar in the terminal
I want to print in a comprehension
I made a payroll program in Python!
How to convert csv to tsv in CLI
I made a script to display emoji
I made a browser automatic stamping tool.
I created a password tool in Python.
I made a plugin to generate Markdown table from csv in Vim
I wrote a code to convert quaternions to z-y-x Euler angles in Python
I made a tool to automatically browse multiple sites with Selenium (Python)
I made a kind of simple image processing tool in Go language.
I made a tool that makes decompression a little easier with CLI (Python3)
I made a program to check the size of a file in Python
A note I looked up to make a command line tool in Python
I made a tool to estimate the execution time of cron (+ PyPI debut)
I made a useful tool for Digital Ocean
Convert PDFs to images in bulk with Python
I want to create a window in Python
I made a tool to notify Slack of Connpass events and made it Terraform
I made a tool to easily display data as a graph by GUI operation.
I made an appdo command to execute a command in the context of the app
I made a router config collection tool Config Collecor
Convert images in multiple folders to different pdfs for each folder at once
I made a tool to generate Markdown from the exported Scrapbox JSON file
I made a tool to automatically back up the metadata of the Salesforce organization
I tried to implement what seems to be a Windows snipping tool in Python
How to read a file in a different directory
I made a Caesar cryptographic program in Python.
I made a tool in Python that right-clicks an Excel file and divides it into files for each sheet.
I want to save a file with "Do not compress images in file" set in OpenPyXL
I can't sleep until I build a server !! (Introduction to Python server made in one day)
I want to easily implement a timeout in python
I made a prime number generation program in Python
I made a user management tool for Let's Chat
I want to transition with a button in flask
I made a library to separate Japanese sentences nicely
I want to write in Python! (2) Let's write a test
I made a cleaning tool for Google Container Registry
I tried to implement a pseudo pachislot in Python
I made a Python module to translate comment outs
Convert the image in .zip to PDF with Python
I want to randomly sample a file in Python