[PYTHON] I used a neural network to check whether the font that "only Japanese people cannot read" is really unreadable only for Japanese people.

I had been a little curious about this topic, and it makes good study material, so I decided to check it.

Origin

On Twitter I came across a font that "only Japanese people cannot read". I thought that sounded silly, but when I looked at it, I really couldn't read it! (http://togetter.com/li/887973) This raised a question:

**"Does readability (that is, distinguishability) change depending on whether the reader can recognize katakana (or shapes close to it)?"**

So I decided to check using a convolutional neural network. Someone may well have tried this already, but I am doing it anyway as a learning exercise.

An explanation of convolutional neural networks is omitted here; I do not know them well enough to explain.

Hypothesis

A classifier trained only on English characters reads this font more accurately than a classifier trained on katakana + English characters.

Method

By comparing the following five patterns, I evaluate how the set of characters the classifier knows affects its accuracy.

| No. | Training data | Number of output classes |
|---|---|---|
| 1 | Alphabet + numbers only | 62 (case sensitive) |
| 2 | No.1 + Hiragana only | 62+48 = 110 |
| 3 | No.1 + Katakana only | 62+48 = 110 |
| 4 | No.1 + Hiragana + Katakana | 62+48*2 = 158 |
| 5 | No.1 + Hebrew letters | 62+27 = 89 |

For No.5, the "letters" and "final forms" listed at http://dic.nicovideo.jp/a/%E3%83%98%E3%83%96%E3%83%A9%E3%82%A4%E6%96%87%E5%AD%97 are treated as separate classes. (To anyone asking "why Hebrew letters in the first place?": I added them for comparison because a comment in the togetter thread said the font "looks a bit like Hebrew letters".)
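The class counts in the table can be checked with a short sketch. The character lists below are assumptions, not taken from the original script: the 46 basic kana plus the obsolete ゐ/ゑ (and ヰ/ヱ), and the Hebrew code points U+05D0 to U+05EA, which cover the letters and their final forms.

```python
import string

# Assumed character sets; the original script may list them differently.
ALNUM = string.ascii_letters + string.digits  # 52 letters + 10 digits
HIRAGANA = (
    "あいうえおかきくけこさしすせそたちつてと"
    "なにぬねのはひふへほまみむめもやゆよ"
    "らりるれろわゐゑをん"
)
# Each katakana sits 0x60 code points above the matching hiragana.
KATAKANA = "".join(chr(ord(c) + 0x60) for c in HIRAGANA)
# U+05D0..U+05EA: Hebrew letters including the final forms.
HEBREW = "".join(chr(c) for c in range(0x05D0, 0x05EB))

PATTERNS = {
    1: ALNUM,
    2: ALNUM + HIRAGANA,
    3: ALNUM + KATAKANA,
    4: ALNUM + HIRAGANA + KATAKANA,
    5: ALNUM + HEBREW,
}


def class_counts():
    """Return {pattern number: number of output classes}."""
    return {no: len(chars) for no, chars in PATTERNS.items()}


if __name__ == "__main__":
    for no, count in class_counts().items():
        print(no, count)
```

Building the label list once per pattern like this also gives a fixed index for each class, which is what the final layer's `class_num` would correspond to.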

Data used

I had hoped to use freely available OCR data, but I couldn't find anything suitable, so I made my own.

Using the script described in "Generate a lot of single-character images with Pillow (PIL)" (http://qiita.com/lazykyama/items/65bcce351f3d1cf07d8e), I generated a single-character image for each target character, then added random noise to augment the data. In this experiment, 30 noisy images per character are generated dynamically and used as training data.

The test images were cut out by hand from the sample image at http://www.dafont.com/electroharmonix.font.
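The augmentation step can be sketched roughly as follows. This is only an illustration under assumptions: the article does not say what kind of noise was used, so uniform additive noise with a hypothetical `strength` of 32 is chosen here, and the image is a plain list of rows rather than a Pillow image. The function names are hypothetical too.

```python
import random


def add_noise(image, strength, rng):
    """Return a noisy copy of a grayscale image (rows of 0-255 ints)."""
    return [
        [min(255, max(0, px + rng.randint(-strength, strength))) for px in row]
        for row in image
    ]


def augment(image, copies=30, strength=32, seed=None):
    """Generate `copies` noisy variants of one character image."""
    rng = random.Random(seed)
    return [add_noise(image, strength, rng) for _ in range(copies)]


if __name__ == "__main__":
    blank = [[255] * 64 for _ in range(64)]  # stand-in for a rendered glyph
    samples = augment(blank, seed=0)
    print(len(samples), len(samples[0]), len(samples[0][0]))
```

Because the noisy copies are produced on the fly, only one clean image per character needs to be stored.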

Both training data and test data use 64x64 grayscale images.

Network structure

The configuration of the classifier is shown in the code below. The parameters were chosen fairly arbitrarily.

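import chainer
import chainer.functions as F

# Two convolution layers followed by two fully connected layers (Chainer v1 API).
self.__model = chainer.FunctionSet(
    conv1=F.Convolution2D(1, 16, 5),    # 1 input channel -> 16 maps, 5x5 kernel
    conv2=F.Convolution2D(16, 16, 5),   # 16 -> 16 maps, 5x5 kernel
    l3=F.Linear(784, 784),              # 784 = 16 maps * 7 * 7
    softmax4=F.Linear(784, class_num))  # one output per character class

# (omitted)

x = chainer.Variable(source)

# convolution -> ReLU -> max pooling, twice
h = F.max_pooling_2d(F.relu(self.__model.conv1(x)), ksize=2, stride=2)
h = F.max_pooling_2d(F.relu(self.__model.conv2(h)), ksize=3, stride=4)
# Dropout is applied only while training.
h = F.dropout(F.relu(self.__model.l3(h)), train=train)
y = self.__model.softmax4(h)  # class scores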

As you can see, this implementation uses Chainer. (I referred to http://qiita.com/hogefugabar/items/312707a09d29632e7288; thank you.)
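The input size 784 of `l3` follows from the 64x64 input and the layer settings. Below is a small sketch of that arithmetic; it assumes that convolution rounds the output size down and that Chainer's max pooling rounds it up (the `cover_all=True` default, as far as I know). The helper names are hypothetical.

```python
def conv_out(size, ksize, stride=1, pad=0, cover_all=False):
    """Output length along one axis of a convolution or pooling window."""
    extra = stride - 1 if cover_all else 0  # round up when cover_all is set
    return (size + 2 * pad - ksize + extra) // stride + 1


def flattened_size(side=64, channels=16):
    """Number of values fed to the first fully connected layer."""
    side = conv_out(side, 5)                             # conv1: 64 -> 60
    side = conv_out(side, 2, stride=2, cover_all=True)   # pool1: 60 -> 30
    side = conv_out(side, 5)                             # conv2: 30 -> 26
    side = conv_out(side, 3, stride=4, cover_all=True)   # pool2: 26 -> 7
    return channels * side * side


if __name__ == "__main__":
    print(flattened_size())
```

With round-down pooling the last step would give 6 instead of 7, and the flattened size would no longer match 784, so the round-up assumption is consistent with the code above.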

Experimental results

All patterns in this experiment use the following parameters. (I am also trying various other settings.)

The results under these conditions are as follows.

| No. | Training data | Accuracy (correct / total) |
|---|---|---|
| 1 | Alphabet + numbers only | 46.2% (12 / 26) |
| 2 | No.1 + Hiragana only | 34.6% (9 / 26) |
| 3 | No.1 + Katakana only | 15.4% (4 / 26) |
| 4 | No.1 + Hiragana + Katakana | 26.9% (7 / 26) |
| 5 | No.1 + Hebrew letters | 23.1% (6 / 26) |

The accuracy is not very high to begin with, but even so, the accuracy when katakana is mixed in is terrible... Maybe the font really is hard for Japanese people to read.

Finally, the test log is shown below. (In the log, labels such as "capsV" denote uppercase letters. Electroharmonix, the font tested here, appears to use the same shape for uppercase and lowercase, so a prediction counts as correct if it matches either case.)

test models.
[PATTERN1]: English only.
2015-10-19 08:08:39,112 [INFO] #data: 26
2015-10-19 08:08:39,186 [INFO] correct: v, answer: capsV => RIGHT
2015-10-19 08:08:39,186 [INFO] correct: g, answer: capsT => WRONG
2015-10-19 08:08:39,186 [INFO] correct: s, answer: capsE => WRONG
2015-10-19 08:08:39,187 [INFO] correct: o, answer: capsD => WRONG
2015-10-19 08:08:39,187 [INFO] correct: i, answer: capsZ => WRONG
2015-10-19 08:08:39,187 [INFO] correct: r, answer: capsR => RIGHT
2015-10-19 08:08:39,187 [INFO] correct: u, answer: capsU => RIGHT
2015-10-19 08:08:39,187 [INFO] correct: a, answer: capsA => RIGHT
2015-10-19 08:08:39,187 [INFO] correct: h, answer: capsH => RIGHT
2015-10-19 08:08:39,188 [INFO] correct: l, answer: capsD => WRONG
2015-10-19 08:08:39,188 [INFO] correct: k, answer: capsN => WRONG
2015-10-19 08:08:39,188 [INFO] correct: n, answer: capsO => WRONG
2015-10-19 08:08:39,188 [INFO] correct: q, answer: capsT => WRONG
2015-10-19 08:08:39,188 [INFO] correct: d, answer: capsD => RIGHT
2015-10-19 08:08:39,188 [INFO] correct: m, answer: capsM => RIGHT
2015-10-19 08:08:39,189 [INFO] correct: t, answer: n => WRONG
2015-10-19 08:08:39,189 [INFO] correct: e, answer: capsE => RIGHT
2015-10-19 08:08:39,189 [INFO] correct: x, answer: x => RIGHT
2015-10-19 08:08:39,189 [INFO] correct: p, answer: capsF => WRONG
2015-10-19 08:08:39,189 [INFO] correct: w, answer: capsQ => WRONG
2015-10-19 08:08:39,189 [INFO] correct: z, answer: capsZ => RIGHT
2015-10-19 08:08:39,189 [INFO] correct: y, answer: capsY => RIGHT
2015-10-19 08:08:39,189 [INFO] correct: c, answer: capsT => WRONG
2015-10-19 08:08:39,190 [INFO] correct: b, answer: capsZ => WRONG
2015-10-19 08:08:39,190 [INFO] correct: f, answer: capsF => RIGHT
2015-10-19 08:08:39,190 [INFO] correct: j, answer: capsZ => WRONG
2015-10-19 08:08:39,190 [INFO] test accuracy: 0.461538461538 (12 / 26)
[PATTERN2]: English and hiragana.
2015-10-19 08:08:39,314 [INFO] #data: 26
2015-10-19 08:08:39,387 [INFO] correct: v, answer: capsV => RIGHT
2015-10-19 08:08:39,387 [INFO] correct: g, answer: capsT => WRONG
2015-10-19 08:08:39,387 [INFO] correct: s, answer: capsB => WRONG
2015-10-19 08:08:39,387 [INFO] correct: o, answer: capsB => WRONG
2015-10-19 08:08:39,388 [INFO] correct: i, answer: capsZ => WRONG
2015-10-19 08:08:39,388 [INFO] correct: r, answer: Ho => WRONG
2015-10-19 08:08:39,388 [INFO] correct: u, answer: capsU => RIGHT
2015-10-19 08:08:39,388 [INFO] correct: a, answer: l => WRONG
2015-10-19 08:08:39,388 [INFO] correct: h, answer: Wa => WRONG
2015-10-19 08:08:39,388 [INFO] correct: l, answer: capsU => WRONG
2015-10-19 08:08:39,388 [INFO] correct: k, answer: capsT => WRONG
2015-10-19 08:08:39,389 [INFO] correct: n, answer: capsD => WRONG
2015-10-19 08:08:39,389 [INFO] correct: q, answer: capsT => WRONG
2015-10-19 08:08:39,389 [INFO] correct: d, answer: capsD => RIGHT
2015-10-19 08:08:39,389 [INFO] correct: m, answer: capsM => RIGHT
2015-10-19 08:08:39,389 [INFO] correct: t, answer: Su => WRONG
2015-10-19 08:08:39,389 [INFO] correct: e, answer: capsE => RIGHT
2015-10-19 08:08:39,390 [INFO] correct: x, answer: x => RIGHT
2015-10-19 08:08:39,390 [INFO] correct: p, answer: capsT => WRONG
2015-10-19 08:08:39,390 [INFO] correct: w, answer: I => WRONG
2015-10-19 08:08:39,390 [INFO] correct: z, answer: capsZ => RIGHT
2015-10-19 08:08:39,390 [INFO] correct: y, answer: capsY => RIGHT
2015-10-19 08:08:39,390 [INFO] correct: c, answer: capsE => WRONG
2015-10-19 08:08:39,391 [INFO] correct: b, answer: capsT => WRONG
2015-10-19 08:08:39,392 [INFO] correct: f, answer: capsT => WRONG
2015-10-19 08:08:39,392 [INFO] correct: j, answer: capsJ => RIGHT
2015-10-19 08:08:39,392 [INFO] test accuracy: 0.346153846154 (9 / 26)
[PATTERN3]: English and katakana.
2015-10-19 08:08:39,517 [INFO] #data: 26
2015-10-19 08:08:39,591 [INFO] correct: v, answer: capsV => RIGHT
2015-10-19 08:08:39,591 [INFO] correct: g, answer: capsQ => WRONG
2015-10-19 08:08:39,591 [INFO] correct: s, answer: La => WRONG
2015-10-19 08:08:39,591 [INFO] correct: o, answer: Wa => WRONG
2015-10-19 08:08:39,591 [INFO] correct: i, answer: ヱ => WRONG
2015-10-19 08:08:39,592 [INFO] correct: r, answer: Wa => WRONG
2015-10-19 08:08:39,592 [INFO] correct: u, answer: capsU => RIGHT
2015-10-19 08:08:39,592 [INFO] correct: a, answer: Mu => WRONG
2015-10-19 08:08:39,592 [INFO] correct: h, answer: r => WRONG
2015-10-19 08:08:39,592 [INFO] correct: l, answer: capsQ => WRONG
2015-10-19 08:08:39,592 [INFO] correct: k, answer: Wa => WRONG
2015-10-19 08:08:39,592 [INFO] correct: n, answer: Wa => WRONG
2015-10-19 08:08:39,593 [INFO] correct: q, answer: capsR => WRONG
2015-10-19 08:08:39,593 [INFO] correct: d, answer: Wa => WRONG
2015-10-19 08:08:39,593 [INFO] correct: m, answer: capsO => WRONG
2015-10-19 08:08:39,593 [INFO] correct: t, answer: l => WRONG
2015-10-19 08:08:39,593 [INFO] correct: e, answer: La => WRONG
2015-10-19 08:08:39,593 [INFO] correct: x, answer: Me => WRONG
2015-10-19 08:08:39,593 [INFO] correct: p, answer: A => WRONG
2015-10-19 08:08:39,593 [INFO] correct: w, answer: capsQ => WRONG
2015-10-19 08:08:39,593 [INFO] correct: z, answer: capsZ => RIGHT
2015-10-19 08:08:39,593 [INFO] correct: y, answer: capsY => RIGHT
2015-10-19 08:08:39,594 [INFO] correct: c, answer: capsE => WRONG
2015-10-19 08:08:39,594 [INFO] correct: b, answer: Ka => WRONG
2015-10-19 08:08:39,594 [INFO] correct: f, answer: 5 => WRONG
2015-10-19 08:08:39,594 [INFO] correct: j, answer: capsT => WRONG
2015-10-19 08:08:39,594 [INFO] test accuracy: 0.153846153846 (4 / 26)
[PATTERN4]: English and hiragana and katakana.
2015-10-19 08:08:39,718 [INFO] #data: 26
2015-10-19 08:08:39,792 [INFO] correct: v, answer: No => WRONG
2015-10-19 08:08:39,792 [INFO] correct: g, answer: capsQ => WRONG
2015-10-19 08:08:39,792 [INFO] correct: s, answer: La => WRONG
2015-10-19 08:08:39,793 [INFO] correct: o, answer: capsQ => WRONG
2015-10-19 08:08:39,793 [INFO] correct: i, answer: capsT => WRONG
2015-10-19 08:08:39,793 [INFO] correct: r, answer: A => WRONG
2015-10-19 08:08:39,793 [INFO] correct: u, answer: capsU => RIGHT
2015-10-19 08:08:39,793 [INFO] correct: a, answer: Mu => WRONG
2015-10-19 08:08:39,793 [INFO] correct: h, answer: Wa => WRONG
2015-10-19 08:08:39,793 [INFO] correct: l, answer: Ri => WRONG
2015-10-19 08:08:39,794 [INFO] correct: k, answer: Wa => WRONG
2015-10-19 08:08:39,794 [INFO] correct: n, answer: Wa => WRONG
2015-10-19 08:08:39,794 [INFO] correct: q, answer: capsQ => RIGHT
2015-10-19 08:08:39,794 [INFO] correct: d, answer: capsD => RIGHT
2015-10-19 08:08:39,795 [INFO] correct: m, answer: capsM => RIGHT
2015-10-19 08:08:39,795 [INFO] correct: t, answer: Na => WRONG
2015-10-19 08:08:39,795 [INFO] correct: e, answer: capsE => RIGHT
2015-10-19 08:08:39,795 [INFO] correct: x, answer: Me => WRONG
2015-10-19 08:08:39,795 [INFO] correct: p, answer: A => WRONG
2015-10-19 08:08:39,796 [INFO] correct: w, answer: B => WRONG
2015-10-19 08:08:39,796 [INFO] correct: z, answer: capsZ => RIGHT
2015-10-19 08:08:39,796 [INFO] correct: y, answer: No => WRONG
2015-10-19 08:08:39,796 [INFO] correct: c, answer: capsQ => WRONG
2015-10-19 08:08:39,796 [INFO] correct: b, answer: capsZ => WRONG
2015-10-19 08:08:39,796 [INFO] correct: f, answer: Te => WRONG
2015-10-19 08:08:39,796 [INFO] correct: j, answer: capsJ => RIGHT
2015-10-19 08:08:39,796 [INFO] test accuracy: 0.269230769231 (7 / 26)
[PATTERN5]: English and Hebrew.
2015-10-19 08:08:39,921 [INFO] #data: 26
2015-10-19 08:08:39,994 [INFO] correct: v, answer: capsV => RIGHT
2015-10-19 08:08:39,995 [INFO] correct: g, answer: ם => WRONG
2015-10-19 08:08:39,995 [INFO] correct: s, answer: capsZ => WRONG
2015-10-19 08:08:39,995 [INFO] correct: o, answer: capsH => WRONG
2015-10-19 08:08:39,995 [INFO] correct: i, answer: capsZ => WRONG
2015-10-19 08:08:39,995 [INFO] correct: r, answer: capsK => WRONG
2015-10-19 08:08:39,995 [INFO] correct: u, answer: capsH => WRONG
2015-10-19 08:08:39,996 [INFO] correct: a, answer: capsA => RIGHT
2015-10-19 08:08:39,996 [INFO] correct: h, answer: b => WRONG
2015-10-19 08:08:39,996 [INFO] correct: l, answer: ם => WRONG
2015-10-19 08:08:39,996 [INFO] correct: k, answer: ל => WRONG
2015-10-19 08:08:39,996 [INFO] correct: n, answer: capsH => WRONG
2015-10-19 08:08:39,997 [INFO] correct: q, answer: capsH => WRONG
2015-10-19 08:08:39,997 [INFO] correct: d, answer: capsD => RIGHT
2015-10-19 08:08:39,997 [INFO] correct: m, answer: capsM => RIGHT
2015-10-19 08:08:39,997 [INFO] correct: t, answer: l => WRONG
2015-10-19 08:08:39,997 [INFO] correct: e, answer: z => WRONG
2015-10-19 08:08:39,997 [INFO] correct: x, answer: capsX => RIGHT
2015-10-19 08:08:39,997 [INFO] correct: p, answer: capsT => WRONG
2015-10-19 08:08:39,998 [INFO] correct: w, answer: capsQ => WRONG
2015-10-19 08:08:39,998 [INFO] correct: z, answer: capsZ => RIGHT
2015-10-19 08:08:39,998 [INFO] correct: y, answer: capsV => WRONG
2015-10-19 08:08:39,998 [INFO] correct: c, answer: capsQ => WRONG
2015-10-19 08:08:39,998 [INFO] correct: b, answer: capsM => WRONG
2015-10-19 08:08:39,998 [INFO] correct: f, answer: capsM => WRONG
2015-10-19 08:08:39,998 [INFO] correct: j, answer: 5 => WRONG
2015-10-19 08:08:39,999 [INFO] test accuracy: 0.230769230769 (6 / 26)
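
The RIGHT / WRONG judgment in the log treats uppercase and lowercase as the same letter. A minimal sketch of that rule, assuming the log format shown above; the regular expression and the function names `normalize` and `score` are hypothetical and not taken from the original script.

```python
import re

# Matches result lines such as "correct: v, answer: capsV => RIGHT".
LINE = re.compile(r"correct:\s*(\S+),\s*answer:\s*(.+?)\s*=>\s*(RIGHT|WRONG)")


def normalize(label):
    """Map a label like 'capsV' to 'v'; leave every other label unchanged."""
    if label.startswith("caps") and len(label) == 5:
        return label[4:].lower()
    return label


def score(log_lines):
    """Return (number right, number of results), ignoring letter case."""
    right = total = 0
    for line in log_lines:
        m = LINE.search(line)
        if m is None:
            continue  # skip headers and summary lines
        correct, answer, _ = m.groups()
        total += 1
        if normalize(answer) == correct:
            right += 1
    return right, total


if __name__ == "__main__":
    sample = [
        "[INFO] correct: v, answer: capsV => RIGHT",
        "[INFO] correct: g, answer: capsT => WRONG",
    ]
    print(score(sample))
```

Recomputing the score from the log in this way is a quick check that the reported totals match the individual lines.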

Even the English-only model mistakes "g" for "T", so I can't draw any firm conclusion.

electroharmonics_g_img.png
↑ "g" in Electroharmonix

Arial_capsT_img.png
↑ "T" in the training data

Still, the English + katakana model is hopeless if it mistakes "s" for "La" (ラ)...

electroharmonics_s_img.png
↑ "s" in Electroharmonix

Osaka_ラ_img.png
↑ "La" (ラ) in the training data

... Actually, no. These two really do look easy to confuse.

Summary

Postscript

I forgot to mention a few things, so I am adding them here.

Remarks

The full set of source code is available here: https://gist.github.com/lazykyama/f586419cd72d5312288e
