I tried reading a 6-digit number with a number recognition app made with Python

ff.PNG

Nice to meet you. My name is dev.

This is my first post on Qiita. I am writing this article in the hope that it will be useful to someone.

This time, I would like to introduce a number recognition application that uses OCR.

About the app

It's very simple: a web application that reads the input image with OCR (Google Cloud Vision API) and returns the result.

Reason for creation

It would be convenient to be able to read an analog meter, such as a car speedometer, from an image. Such services seem to exist already, but it all started when I thought, "I want to try something a little closer to home."

However, reading a meter seems to be difficult. Since this is my first app, I set myself the challenge of recognizing digits instead: "Let's start by reading the numbers on an analog odometer!"

Basic functions

If you select an image of a 6-digit number and submit it, the recognized number is returned to the web page.

For example, it can read images like these. sample1.png sample2.png
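OCR usually returns free text rather than a clean number, so some post-processing is needed before the 6 digits can be shown. The following is a minimal sketch of that step, not the app's actual code; the helper name `extract_six_digits` is hypothetical, and removing whitespace before matching is an assumption that suits odometer-style images.

```python
import re


def extract_six_digits(ocr_text):
    """Return the first run of exactly six digits in ocr_text, or None."""
    # Drop whitespace so that "012 345" or "012\n345" still counts as one number.
    compact = re.sub(r"\s+", "", ocr_text)
    # The lookarounds reject six digits that are part of a longer number.
    match = re.search(r"(?<!\d)\d{6}(?!\d)", compact)
    return match.group(0) if match else None


print(extract_six_digits("ODO 012 345 km"))  # 012345
print(extract_six_digits("1234567"))         # None
```

Returning a string rather than an int keeps leading zeros, which matter for a mileage display.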

What I felt while making it was that, with the Google Cloud Vision API, you can easily build a high-accuracy app like this.

It's easy, and the accuracy is good!

Moreover, it can recognize not only numbers but also letters.

So images like these are also OK.

図1.png あ1.png

But what can it be used for?

It's a toy app, so it can't be used for anything practical as it is.

As a more advanced form, I think it could also be used for reading serial numbers such as document numbers, and for reading slips.

You can also implement the OCR function with Tesseract or other free software instead of the Google Cloud Vision API.

Implementation environment

HTML, CSS, Flask

Options considered

To read the numbers, I considered the following two approaches.
・ A model trained on the MNIST dataset
・ OCR

Training with the MNIST dataset

Training a model on MNIST is relatively easy. However, each MNIST image contains a single digit, so to handle a multi-digit number you also need object detection to locate the first digit, the second digit, and so on.
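To illustrate the extra step that the MNIST route would need, here is a minimal sketch of one simple way to locate digits: splitting the image wherever a blank column separates two runs of ink. This is a swapped-in technique (column projection), not full object detection and not something the app uses. The function name `split_digit_columns` is hypothetical, and the sketch assumes a clean, binarized image whose digits do not touch.

```python
def split_digit_columns(image, threshold=0):
    """Return (start, end) column ranges, end exclusive, one per run of ink columns.

    image is a list of rows of pixel values; values above threshold count as ink.
    """
    if not image:
        return []
    width = len(image[0])
    # A column has ink if any pixel in that column exceeds the threshold.
    has_ink = [any(row[x] > threshold for row in image) for x in range(width)]

    ranges = []
    start = None
    for x, ink in enumerate(has_ink):
        if ink and start is None:
            start = x                    # a new digit begins here
        elif not ink and start is not None:
            ranges.append((start, x))    # a blank column ends the digit
            start = None
    if start is not None:
        ranges.append((start, width))    # the last digit touches the right edge
    return ranges


tiny = [
    [0, 1, 1, 0, 0, 1, 0],
    [0, 1, 0, 0, 0, 1, 1],
]
print(split_digit_columns(tiny))  # [(1, 3), (5, 7)]
```

Each range would then be cropped, resized to 28x28, and passed to the trained model one digit at a time.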

The trained model can be saved as in the sample code below. ■ [Reference]

Sample code for MNIST training


# Train a small CNN on MNIST and save the trained model.
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense, Activation, Flatten
from keras.layers import Conv2D, MaxPooling2D, Dropout, Reshape
from keras.utils import to_categorical
import numpy as np

# Load the data and scale pixel values to the range 0-1.
(X_train, y_train), (X_test, y_test) = mnist.load_data()
X_train = np.array(X_train) / 255
X_test = np.array(X_test) / 255

# One-hot encode the labels (digits 0-9).
y_train = to_categorical(y_train)
y_test = to_categorical(y_test)

model = Sequential()
# Add a channel axis: (28, 28) -> (28, 28, 1).
model.add(Reshape((28, 28, 1), input_shape=(28, 28)))
model.add(Conv2D(32, (3, 3)))
model.add(Activation("relu"))
model.add(Conv2D(32, (3, 3)))
model.add(Activation("relu"))
model.add(MaxPooling2D((2, 2)))
model.add(Dropout(0.5))
model.add(Conv2D(16, (3, 3)))
model.add(Activation("relu"))
model.add(MaxPooling2D((2, 2)))
model.add(Dropout(0.5))
# Classifier head: flatten, one hidden layer, then 10-way softmax.
model.add(Flatten())
model.add(Dense(784))
model.add(Activation("relu"))
model.add(Dropout(0.5))
model.add(Dense(10))
model.add(Activation("softmax"))

model.compile(loss="categorical_crossentropy", optimizer="sgd", metrics=["accuracy"])
# Train for one epoch, holding out 10% of the training data for validation.
hist = model.fit(X_train, y_train, batch_size=200,
                 verbose=1, epochs=1, validation_split=0.1)

# Evaluate on the test set and save the trained model.
score = model.evaluate(X_test, y_test, verbose=1)
print("Test loss:", score[0])
print("Test accuracy:", score[1])
model.save("C:/test/mnist_main.h5")

Recognition using OCR

OCR is the quickest approach. Because it uses Google's API, it is highly accurate and you do not need to build a model yourself. It also reads numbers without you having to worry about the number of digits.

Depending on the purpose, it is more than good enough!

Accuracy

To check the accuracy, I had it read an image containing 1,000 digits. ![1000.png](https://qiita-image-store.s3.ap-northeast-1.amazonaws.com/0/969670/e4dba485-359e-b3d8-c1e5-e5c1c32ff511.png)

■ Conditions
・ Image size: 1,024 x 768
・ Font: Yu Gothic
・ Font size: 16 x 23 pixels (W x H)
・ Character spacing: 5 pixels
・ Line spacing: 11 pixels

■ Results
・ Accuracy: 100%
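The article does not say how the accuracy figure was computed. One plausible way, shown here as a sketch under that assumption, is to align the OCR output with the expected digits and count how many expected characters were matched. The function name `char_accuracy` is hypothetical, and `difflib` is used so that an inserted or dropped character does not shift every later comparison.

```python
from difflib import SequenceMatcher


def char_accuracy(expected, recognized):
    """Fraction of the characters in expected that difflib aligns with recognized."""
    if not expected:
        return 1.0 if not recognized else 0.0
    matcher = SequenceMatcher(None, expected, recognized, autojunk=False)
    # Sum the lengths of all aligned (identical) blocks.
    matched = sum(block.size for block in matcher.get_matching_blocks())
    return matched / len(expected)


print(char_accuracy("123456", "123456"))  # 1.0
print(round(char_accuracy("123456", "12I456"), 3))  # 0.833
```

Under this measure, 99.9% on 1,000 characters would correspond to a single unmatched character.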

Since the accuracy was high, I next tried changing the font under the same conditions.

■ Results
・ MS PGothic: 100%
・ MS PMincho: 100%
・ Fugaz One: 100%
・ Ink Free: 99.9%
・ np b: 99.6%

All are highly accurate, but the "np b" font is less accurate than the others. Why?

np-b.png

The cause is the shape of "1". In some places it was recognized as "|" (pipe) or "I" (capital i).
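When the field is known to contain only digits, misreads like these can be corrected after OCR. The sketch below maps the two look-alikes reported above back to "1". The helper name `normalize_digits` is hypothetical, and this correction is only safe for digit-only fields, since it would corrupt text that legitimately contains "I" or "|".

```python
# Look-alike characters reported in this article, mapped back to the digit "1".
LOOKALIKES = str.maketrans({"|": "1", "I": "1"})


def normalize_digits(ocr_text):
    """Replace known look-alike characters with the digit they resemble."""
    return ocr_text.translate(LOOKALIKES)


print(normalize_digits("12|45I"))  # 121451
```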

■ Other 1 Ink Free is a handwriting-style font like the one below, yet the accuracy was as high as 99.9%, so it seems to be on par with standard fonts. ink_free.png

■ Other 2 I also tried an image with the character spacing and line spacing reduced to 1 pixel, and the result was still 100%.

Yes, it is good enough to use!

There is a lot of information about accuracy on the web, so it is worth searching for.


Articles that I referred to
・ [Differences in character identification by images](https://qiita.com/saken649/items/4bfd215bf943c36a52ab)
・ [Throughput by image size](https://qiita.com/se_fy/items/963b295bbd13101c044b)

About Google Cloud Vision API

The setup itself is very easy: you can use the API once you obtain an API key. * Note that if billing is not enabled, an error is returned and the API cannot be used.
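As a rough illustration of calling the API with a key, the sketch below sends an image to the REST `images:annotate` endpoint with the `TEXT_DETECTION` feature, using only the standard library. It is not the app's actual code (the author may well have used Google's client library instead), and the function names `build_request_body` and `detect_text` are hypothetical.

```python
import base64
import json
import urllib.request

VISION_URL = "https://vision.googleapis.com/v1/images:annotate"


def build_request_body(image_bytes):
    """Build the JSON body for a single-image TEXT_DETECTION request."""
    content = base64.b64encode(image_bytes).decode("ascii")
    return {
        "requests": [
            {
                "image": {"content": content},
                "features": [{"type": "TEXT_DETECTION"}],
            }
        ]
    }


def detect_text(image_bytes, api_key):
    """Send the image to the Vision API and return the detected text ("" if none)."""
    body = json.dumps(build_request_body(image_bytes)).encode("utf-8")
    request = urllib.request.Request(
        f"{VISION_URL}?key={api_key}",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        result = json.load(response)
    # The first text annotation holds the full text found in the image.
    annotations = result["responses"][0].get("textAnnotations", [])
    return annotations[0]["description"] if annotations else ""
```

A call would look like `detect_text(open("sample1.png", "rb").read(), api_key)`, where `api_key` is your own key. If billing is not enabled, `urlopen` raises an `HTTPError`.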

The price also seems to be quite low. The first 1,000 units per month are free; after that, it costs $1.50 per 1,000 units. The unit price changes depending on the monthly usage tier.
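As a back-of-the-envelope check, the figures above can be turned into a small estimate. This sketch uses only the numbers quoted in this article, assumes usage is billed proportionally, and ignores the cheaper higher-volume tiers, so treat the result as an upper-end estimate and confirm against the current price list. The function name `estimate_monthly_cost` is hypothetical.

```python
FREE_UNITS = 1000        # free units per month, as quoted above
PRICE_PER_1000 = 1.5     # US dollars per 1,000 units after the free tier


def estimate_monthly_cost(units):
    """Estimate the monthly charge in US dollars for the given number of units."""
    billable = max(0, units - FREE_UNITS)
    return billable / 1000 * PRICE_PER_1000


print(estimate_monthly_cost(500))   # 0.0
print(estimate_monthly_cost(3000))  # 3.0
```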

It seems you can run it on pocket money. Still, it is a good idea to set up a billing alert just in case (to prevent accidents).

Future development

I think it would be useful to implement a function that reads document numbers to suit a particular business workflow. I will keep learning in my free time.
