■ Background

I wanted to see how easily I could build an OCR application that launches from the desktop, so I built one.

■ Environment

・ macOS Catalina 10.15.4
・ Visual Studio Code
・ Python 3.8.1

■ Library installation

1. PySimpleGUI (for building the GUI)

pip install pysimplegui

2. Tesseract OCR (the OCR engine)

brew install tesseract

pyocr (OCR tool wrapper for Python)

sudo pip3 install pyocr

pillow (image loading)

sudo pip3 install pillow
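Before moving on, it can help to confirm that everything installed above is actually visible to Python. The following is only a minimal sketch using the standard library: the function name `check_ocr_environment` is made up for this example, and it assumes the three packages are imported as `PySimpleGUI`, `pyocr`, and `PIL` (pillow's import name) and that the Tesseract command is called `tesseract`.

```python
import importlib.util
import shutil


def check_ocr_environment(modules=("PySimpleGUI", "pyocr", "PIL"), command="tesseract"):
    """Report which required modules and commands can be found.

    Returns a dict mapping each name to True (found) or False (missing).
    """
    report = {}
    for name in modules:
        # find_spec returns None when a top-level module cannot be imported
        report[name] = importlib.util.find_spec(name) is not None
    # shutil.which returns the executable's path, or None if it is not on PATH
    report[command] = shutil.which(command) is not None
    return report


if __name__ == "__main__":
    for name, found in check_ocr_environment().items():
        print(f"{name}: {'OK' if found else 'missing'}")
```

If any entry is reported as missing, the corresponding install command above is the first thing to re-check.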

● For what each of these libraries does, and for the OCR part in general, I relied heavily on the following article (thank you): "Try simple OCR with Tesseract + PyOCR"

● Windows users may find this one easier to follow: "[Python] How to transcribe an image and convert it to text (tesseract-OCR, pyocr)"

Now let's look at the code that reads characters from an image. The imports appear in the complete source code at the end, so I omit them here.

def scan_file_to_str(file_path, language):
    """Generate a string from an image file.

    Args:
        file_path (str): Path of the image file to read
        language (str): 'jpn' or 'eng'

    Returns:
        str: The text read from the image
    """
    tools = pyocr.get_available_tools()
    if len(tools) == 0:
        print("No OCR tool found")
        sys.exit(1)

    # Use the first OCR tool that was found
    tool = tools[0]

    text = tool.image_to_string(
        # Open the file passed as an argument
        Image.open(file_path),
        # Language passed as an argument ('jpn' or 'eng')
        lang=language,
        # Layout 6 tells Tesseract to assume a single uniform block of text
        builder=pyocr.builders.TextBuilder(tesseract_layout=6)
    )
    # Finally, return the string read from the image
    return text

It's really surprising that you can read a character string from an image in just 15 lines. I was impressed.
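One thing to keep in mind once this function sits behind a GUI: `sys.exit(1)` closes the whole application, and `tools[0]` is used even if the requested language is not installed for it. The sketch below shows one possible alternative, not the approach used in this article. The names `OcrUnavailableError` and `pick_ocr_tool` are hypothetical; the only pyocr calls it relies on are `get_name()` and `get_available_languages()`, which pyocr tools provide.

```python
class OcrUnavailableError(RuntimeError):
    """Raised when no installed OCR tool can handle the requested language."""


def pick_ocr_tool(tools, language):
    """Return the first tool that lists `language` among its available languages.

    `tools` is expected to be the list returned by pyocr.get_available_tools().
    """
    if not tools:
        raise OcrUnavailableError("No OCR tool found")
    for tool in tools:
        if language in tool.get_available_languages():
            return tool
    # No tool supports the language: report which tools were checked
    names = ", ".join(tool.get_name() for tool in tools)
    raise OcrUnavailableError(
        f"Language '{language}' is not installed for: {names}"
    )
```

With something like this, the caller could catch `OcrUnavailableError` and show the message in a pop-up instead of exiting, for example when the `jpn` trained data is missing.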

Next, I put this behind a GUI. tkinter is probably the best-known option for Python GUIs, and I started out writing the code with it, but while researching I came across the following article.

[If you use Tkinter, try using PySimpleGUI](https://qiita.com/dario_okazaki/items/656de21cab5c81cabe59#exe%E5%8C%96%E3%81%AB%E3%81%A4%E3%81%84%E3%81%A6)

I was also impressed by the fact that the GUI could be implemented with simple code, so I decided to use it.

Here is the code for the GUI part.

# Set the theme (many themes are available)
sg.theme('Light Grey1')

# Define what goes where (the layout is easier to build once you know it is arranged row by row)
layout = [
    # Row 1 (Text: a text label)
    [sg.Text('File to read(Multiple selections possible)', font=('IPA Gothic', 16))],

    # Row 2 (InputText: text box, FilesBrowse: file dialog)
    [sg.InputText(font=('IPA Gothic', 14), size=(70, 10),), sg.FilesBrowse('Select files', key='-FILES-'),],

    # Row 3 (Text: label, Radio: two radio buttons in the same group)
    [sg.Text('Language to read', font=('IPA Gothic', 16)),
    sg.Radio('Japanese', 1, key='-jpn-', font=('IPA Gothic', 10)),
    sg.Radio('English', 1, key='-eng-', font=('IPA Gothic', 10))],

    # Row 4 (Button: a button)
    [sg.Button('Read execution'),],

    # Row 5 (MLine: a text area of 100 columns x 30 rows)
    [sg.MLine(font=('IPA Gothic', 14), size=(100,30), key='-OUTPUT-'),]
]

# Create the window (the arguments of Window are the title and the layout)
window = sg.Window('Easy OCR', layout)

# List that holds the files to read
files = []

# Run an infinite loop and wait for events such as button clicks
while True:
    event, values = window.read()
    # event is None when the window's "✕" button is pressed; leave the loop so the window can be closed
    if event is None:
        break

    # When the 'Read execution' button is pressed
    if event == 'Read execution':
        # values['-FILES-'] holds the selected file names separated by ';'
        # (assigned rather than extended, so files from an earlier click are not scanned again)
        files = values['-FILES-'].split(';')
        # language is 'jpn' if the Japanese radio button is selected, otherwise 'eng'
        language = 'jpn' if values['-jpn-'] else 'eng'
        text = ''
        # Loop over the selected files
        for i, file_path in enumerate(files):
            if i != 0:
                # Put a delimiter between files
                text += '================================================================================================\n'
            # Receive the string read by scan_file_to_str, defined earlier
            text += scan_file_to_str(file_path, language)
            if language == 'jpn':
                # Japanese results contained many extra spaces, so remove them
                text = text.replace(' ', '')
            # Leave two blank lines before the next file's text
            text += '\n\n'
        # Show the result (text) in the MLine whose key is '-OUTPUT-'
        window['-OUTPUT-'].update(text)
        # Announce completion with a pop-up window
        sg.Popup('Has completed')

window.close()
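The loop above splits the `;`-separated string from the file dialog directly. If the button is pressed with nothing selected, that split yields a single empty path, which would then be passed on to the OCR function. The following is a small sketch of a more defensive split; the helper name `parse_file_list` is made up for this example and is not part of PySimpleGUI.

```python
def parse_file_list(raw, delimiter=";"):
    """Split the file dialog's string into a clean list of paths.

    Empty entries and surrounding whitespace are dropped, and a path that
    appears more than once is kept only the first time.
    """
    paths = []
    for part in raw.split(delimiter):
        path = part.strip()
        if path and path not in paths:
            paths.append(path)
    return paths


print(parse_file_list("/tmp/a.png;/tmp/b.png;;/tmp/a.png"))
# → ['/tmp/a.png', '/tmp/b.png']
```

An empty result could then be used to skip the OCR step and show a message asking the user to select a file first.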

I also relied heavily on the following resources for the GUI, so I list them here.

・ Learning Notes for K-TechLabo Seminar → The PDF text is very easy to understand.
・ Create a UI that replaces VBA with PySimpleGUI (file dialog, list, log output) → Written by the same author as the article introduced earlier. I learned a lot from it as well.

■ Source code (completed)

import sys
from PIL import Image

import PySimpleGUI as sg
import pyocr
import pyocr.builders


def scan_file_to_str(file_path, language):
   """Generate a string from an image file.

   Args:
      file_path (str): Path of the image file to read
      language (str): 'jpn' or 'eng'

   Returns:
      str: The text read from the image
   """
   tools = pyocr.get_available_tools()
   if len(tools) == 0:
      print("No OCR tool found")
      sys.exit(1)

   tool = tools[0]

   text = tool.image_to_string(
      Image.open(file_path),
      lang=language,
      builder=pyocr.builders.TextBuilder(tesseract_layout=6)
   )
   return text


#Set theme
sg.theme('Light Grey1')

layout = [
   #The first line
   [sg.Text('File to read(Multiple selections possible)', font=('IPA Gothic', 16))],
   #2nd line
   [sg.InputText(font=('IPA Gothic', 14), size=(70, 10),), sg.FilesBrowse('Select files', key='-FILES-'),],
   #3rd line
   [sg.Text('Language to read', font=('IPA Gothic', 16)), 
   sg.Radio('Japanese', 1, key='-jpn-', font=('IPA Gothic', 10)),
   sg.Radio('English', 1, key='-eng-', font=('IPA Gothic', 10))],
   #4th line
   [sg.Button('Read execution'),],
   #5th line
   [sg.MLine(font=('IPA Gothic', 14), size=(100,30), key='-OUTPUT-'),]
]

#Get window
window = sg.Window('Easy OCR', layout)

while True:
   event, values = window.read()
   if event is None:
      break

   if event == 'Read execution':
      # Assigned rather than extended, so files from an earlier click are not scanned again
      files = values['-FILES-'].split(';')
      language = 'jpn' if values['-jpn-'] else 'eng'
      text = ''
      for i, file_path in enumerate(files):
         if i != 0:
            text += '================================================================================================\n'
         text += scan_file_to_str(file_path, language)
         if language == 'jpn':
            text = text.replace(' ', '')
         text += '\n\n'
      window['-OUTPUT-'].update(text)
      sg.Popup('Has completed')

window.close()

I had the app read two English images and one Japanese image.

[English, image 1 (from The White House Building)] (screenshot 2020-05-06 22.47.50.png)

[English, image 2] (screenshot 2020-05-06 22.48.00.png)

☟

[Result] (screenshot 2020-05-06 22.59.50.png)

English is read quickly and with a high degree of accuracy.

[Japanese (from Aozora Bunko)]

☟

[Result] (screenshot 2020-05-06 22.58.45.png)

Japanese takes longer. Still, the accuracy seems good enough to be usable.
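For Japanese, the code removes every space from the result with `text.replace(' ', '')`. That also joins English words if the image mixes both languages. As a possible refinement, the sketch below removes only the spaces that touch a Japanese character. The function name `strip_japanese_spaces` is made up for this example, and the Unicode ranges are my own assumption about what counts as "Japanese" here (hiragana, katakana, CJK ideographs, CJK punctuation, and full-width forms).

```python
import re

# Assumed character ranges: CJK punctuation, hiragana and katakana,
# CJK unified ideographs, and full-width forms.
_JP = r"\u3001-\u303F\u3040-\u30FF\u4E00-\u9FFF\uFF00-\uFFEF"
# A run of spaces preceded by, or followed by, one of those characters
_SPACE_NEXT_TO_JP = re.compile(rf"(?<=[{_JP}]) +| +(?=[{_JP}])")


def strip_japanese_spaces(text):
    """Remove runs of spaces that touch a Japanese character on either side."""
    return _SPACE_NEXT_TO_JP.sub("", text)


print(strip_japanese_spaces("Easy OCR は 便利"))
# → Easy OCRは便利
```

Spaces between two ASCII words are left alone, and newlines are never touched, so the line structure of the OCR result is preserved.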

■ Finally

I actually wanted to package this app as an executable that runs on the Mac or Windows desktop, but neither pyinstaller nor py2app worked for me, so I decided to publish the article in its current state. If I get that working later, I will update it.

If you have any suggestions or opinions, such as "Shouldn't this part be different?" or "There is also this way of doing it," please feel free to leave them in the comment section.

[PYTHON] I tried to make an OCR application with PySimpleGUI
