Read text in images with Python OCR

Install Tesseract

$ brew install tesseract

Install the library that runs Tesseract from Python

$ pip3 install pyocr

Add the Japanese language data

$ curl -L -o /usr/local/share/tessdata/jpn.traineddata 'https://github.com/tesseract-ocr/tessdata/raw/master/jpn.traineddata'
$ tesseract --list-langs

List of available languages (4):
eng
jpn
osd
snum
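
The same check can be done from Python before running OCR. The following is a minimal sketch, not part of the original setup: the helper names parse_langs and installed_langs are hypothetical, and it assumes the tesseract command is on the PATH. Some Tesseract versions appear to print the list to stderr rather than stdout, so the sketch reads both.

```python
import subprocess


def parse_langs(output):
    """Extract language codes from the text printed by `tesseract --list-langs`."""
    langs = []
    for line in output.splitlines():
        line = line.strip()
        # Skip blank lines and the "List of available languages ..." header
        if not line or line.lower().startswith("list of available languages"):
            continue
        langs.append(line)
    return langs


def installed_langs():
    """Run `tesseract --list-langs` and return the language codes it reports."""
    result = subprocess.run(
        ["tesseract", "--list-langs"],
        capture_output=True,
        text=True,
        check=True,
    )
    # Read both streams, since the output stream may differ between versions
    return parse_langs(result.stdout + result.stderr)
```

If "jpn" is not in the list returned by installed_langs(), the traineddata file was probably saved to a directory that Tesseract does not search.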

OCR implementation

from PIL import Image
import sys
import pyocr
import pyocr.builders

tools = pyocr.get_available_tools()
if len(tools) == 0:
    print("No OCR tool found")
    sys.exit(1)
# The tools are returned in the recommended order of usage
tool = tools[0]

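# Replace '{path}' with the path to the image you want to read
txt = tool.image_to_string(
    Image.open('{path}'),
    lang="jpn",
    # Layout 6 is Tesseract's page segmentation mode 6:
    # treat the image as a single uniform block of text
    builder=pyocr.builders.TextBuilder(tesseract_layout=6)
)
print(txt)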
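
The script above can also be wrapped in a function that falls back to another language when "jpn" is not installed. This is only a sketch under a few assumptions: choose_lang and ocr_image are hypothetical names, the fallback order ("jpn", then "eng") is an arbitrary choice, and the pyocr imports are placed inside the function so that choose_lang can be used on its own.

```python
def choose_lang(available, preferred=("jpn", "eng")):
    """Return the first preferred language that is actually installed."""
    for lang in preferred:
        if lang in available:
            return lang
    raise ValueError("None of %r is installed (found: %r)" % (preferred, list(available)))


def ocr_image(path, preferred=("jpn", "eng")):
    """Read the text in the image at `path` with the first available OCR tool."""
    # Imported here so that choose_lang works even without pyocr installed
    from PIL import Image
    import pyocr
    import pyocr.builders

    tools = pyocr.get_available_tools()
    if not tools:
        raise RuntimeError("No OCR tool found")
    tool = tools[0]

    # Ask the tool which languages it can use, then pick one of ours
    lang = choose_lang(tool.get_available_languages(), preferred)

    with Image.open(path) as img:
        return tool.image_to_string(
            img,
            lang=lang,
            builder=pyocr.builders.TextBuilder(tesseract_layout=6),
        )
```

Raising exceptions instead of calling sys.exit makes the function easier to reuse from other scripts; the caller decides how to report the error.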
