OCR in Java (character recognition from images)

What to do

Extract text from images using the open-source library tess4j.

Maven: copy the dependency snippet from mvnrepository and paste it into pom.xml.

<dependency>
    <groupId>net.sourceforge.tess4j</groupId>
    <artifactId>tess4j</artifactId>
    <version>4.3.1</version>
</dependency>

This downloads tess4j-4.3.1.jar.

If you cannot use Maven, download the JAR manually from here.

Japanese recognition file

Download the Japanese trained-data file (jpn.traineddata) from the GitHub repository.
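Before running OCR, it can help to confirm that the language file is actually where the program will look for it. The following is a minimal sketch, assuming the data directory directly contains a file named `<language>.traineddata` (as in this post, where jpn.traineddata is placed in C:\work). The class `TrainedDataCheck` and its methods are hypothetical helpers for illustration and are not part of tess4j.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper for illustration; not part of tess4j.
final class TrainedDataCheck {

    // Builds the expected location of the language file, e.g. <dataDir>/jpn.traineddata.
    static Path trainedDataFile(Path dataDir, String language) {
        return dataDir.resolve(language + ".traineddata");
    }

    // Returns true only if the language file exists as a regular file.
    static boolean hasTrainedData(Path dataDir, String language) {
        return Files.isRegularFile(trainedDataFile(dataDir, language));
    }

    public static void main(String[] args) {
        // Assumption: the same directory that is later passed to setDatapath.
        Path dataDir = Paths.get("C:\\work");
        if (hasTrainedData(dataDir, "jpn")) {
            System.out.println("Found: " + trainedDataFile(dataDir, "jpn"));
        } else {
            System.err.println("Missing: " + trainedDataFile(dataDir, "jpn"));
        }
    }
}
```

Running this check first gives a clearer message than an OCR failure caused by a missing language file.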

Source

`OcrTrial.java`


import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import javax.imageio.ImageIO;
import net.sourceforge.tess4j.ITesseract;
import net.sourceforge.tess4j.Tesseract;
import net.sourceforge.tess4j.TesseractException;

public class OcrTrial {
	public static void main(String[] args) throws IOException, TesseractException {
		// Load the input image
		File file = new File("C:\\work\\INPUT.JPG");
		BufferedImage img = ImageIO.read(file);

		ITesseract tesseract = new Tesseract();
		tesseract.setDatapath("C:\\work"); // Directory containing the language file (jpn.traineddata)
		tesseract.setLanguage("jpn"); // Use Japanese as the recognition language

		// Run OCR on the image
		String str = tesseract.doOCR(img);

		// Print the recognized text
		System.out.println(str);
	}
}
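One detail worth noting in the code above: `ImageIO.read` returns `null` when no registered image reader can decode the file, which would only surface later as an error inside the OCR call. The following is a small sketch of a stricter loader, assuming you prefer to fail early with a clear message. The class `ImageLoadSketch` and the method `loadImage` are hypothetical names introduced here, not part of tess4j or of the original program.

```java
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import javax.imageio.ImageIO;

// Hypothetical helper for illustration; not part of tess4j.
final class ImageLoadSketch {

    // Loads an image and fails early if the file is missing or cannot be decoded.
    static BufferedImage loadImage(File file) throws IOException {
        if (!file.isFile()) {
            throw new IOException("Image file not found: " + file);
        }
        BufferedImage img = ImageIO.read(file);
        if (img == null) {
            // ImageIO.read returns null when no registered reader supports the format.
            throw new IOException("Unsupported or unreadable image format: " + file);
        }
        return img;
    }
}
```

In the program above, `ImageIO.read(file)` could be swapped for `ImageLoadSketch.loadImage(file)` without changing anything else, since `main` already declares `IOException`.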

Image file used as input (INPUT.JPG)

Output result

(Screenshot of the output: キャプチャ.JPG)

Summary

There was one misrecognition: the correct text is "pictogram" (〇), but it was read as "Pivot Gram" (×).

The recognition rate seems to be high when the characters in the image are clearly legible.

Next time

- [ ] Try various images
- [ ] Grayscale
- [ ] Understand and use the functions of the Tesseract class
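For the grayscale item in the list above, one possible starting point is to convert the image with the standard Java 2D API before passing it to `doOCR`. This is only a sketch using JDK classes. The class `GrayscaleSketch` and the method `toGrayscale` are hypothetical names, and whether grayscale conversion actually improves recognition for a given image has not been tested here.

```java
import java.awt.Graphics2D;
import java.awt.image.BufferedImage;

// Hypothetical helper for illustration; not part of tess4j.
final class GrayscaleSketch {

    // Returns a new 8-bit grayscale copy of the source image; the source is left unchanged.
    static BufferedImage toGrayscale(BufferedImage src) {
        BufferedImage gray = new BufferedImage(
                src.getWidth(), src.getHeight(), BufferedImage.TYPE_BYTE_GRAY);
        Graphics2D g = gray.createGraphics();
        try {
            // Drawing onto a TYPE_BYTE_GRAY image converts the pixels to gray levels.
            g.drawImage(src, 0, 0, null);
        } finally {
            g.dispose();
        }
        return gray;
    }
}
```

The result is still a `BufferedImage`, so it could be passed to `tesseract.doOCR(...)` in the same way as the original image.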
