[Report] I tried cvusk's "Judgment of author-likeness in Aozora Bunko".

I was wandering in the dark.

"I want to do deep learning." With that in mind, I read the often-recommended [Deep Learning from scratch](https://www.amazon.co.jp/%E3%82%BC%E3%83%AD%E3%81%8B%E3%82%89%E4%BD%9C%E3%82%8BDeep-Learning-%E2%80%95Python%E3%81%A7%E5%AD%A6%E3%81%B6%E3%83%87%E3%82%A3%E3%83%BC%E3%83%97%E3%83%A9%E3%83%BC%E3%83%8B%E3%83%B3%E3%82%B0%E3%81%AE%E7%90%86%E8%AB%96%E3%81%A8%E5%AE%9F%E8%A3%85-%E6%96%8E%E8%97%A4-%E5%BA%B7%E6%AF%85/dp/4873117585). I felt I more or less understood it, but when I tried playing with the code in the book, it didn't work. Just as I was about to give up, a light appeared.

Judgment of author-likeness in Aozora Bunko (KERAS + character-level cnn)

I tried training

Once the crawl of Aozora Bunko had finished, I ran aozora_cnn.py from the article. It runs for 100 epochs, and it took three days for all of the training to finish.

The model file from epoch 11 appeared to have the highest validation accuracy.

Checkpoints are set up.
With a checkpoint, whenever the validation accuracy reaches a new high, the model weights at that point are saved under /tmp/ as a *.hdf5 file.
Epoch 11/100
378700/378774 [============================>.] - ETA: 0s - loss: 0.1420 - acc: 0.9449Epoch 00010: val_acc improved from 0.87362 to 0.89609, saving model to /tmp/weights.10-0.14-0.94-0.41-0.90.hdf5
378774/378774 [==============================] - 2298s - loss: 0.1420 - acc: 0.9449 - val_loss: 0.4083 - val_acc: 0.8961
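
The file name in the log hints at how the checkpoint path is put together. As a rough sketch (not the code from aozora_cnn.py), assuming a template with the fields epoch, loss, acc, val_loss and val_acc, plain Python string formatting reproduces the name seen above. The `TEMPLATE` string and the `BestOnlyTracker` helper are my own illustrations; Keras's `ModelCheckpoint` callback handles the equivalent bookkeeping when `save_best_only=True` is set.

```python
# Assumed template, inferred from the file name in the log; not taken from aozora_cnn.py.
TEMPLATE = "/tmp/weights.{epoch:02d}-{loss:.2f}-{acc:.2f}-{val_loss:.2f}-{val_acc:.2f}.hdf5"


class BestOnlyTracker:
    """Remembers the best validation accuracy seen so far (hypothetical helper)."""

    def __init__(self):
        self.best = float("-inf")

    def should_save(self, val_acc):
        """Return True only when val_acc beats every earlier epoch."""
        if val_acc > self.best:
            self.best = val_acc
            return True
        return False


def checkpoint_path(epoch, logs, template=TEMPLATE):
    """Fill the template with the epoch index and that epoch's metrics."""
    return template.format(epoch=epoch, **logs)


# Metrics copied from the log line for "Epoch 11/100" (index 10 when counted from 0).
logs = {"loss": 0.1420, "acc": 0.9449, "val_loss": 0.4083, "val_acc": 0.8961}

tracker = BestOnlyTracker()
tracker.should_save(0.87362)              # previous best reported in the log
if tracker.should_save(logs["val_acc"]):  # 0.8961 > 0.87362, so a save is triggered
    print(checkpoint_path(10, logs))      # /tmp/weights.10-0.14-0.94-0.41-0.90.hdf5
```

Because a file is only written when the validation accuracy improves, the absence of later files simply means no later epoch beat epoch 11.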

I tried to judge

Since no model file was saved after epoch 11, I ran aozora_classification.py with the /tmp/weights.*.hdf5 file that the checkpoint created at epoch 11.
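
Picking the right weights file by eye works, but it can also be done from the file names. The following is a minimal sketch, assuming the names follow the pattern seen in the training log (epoch, loss, acc, val_loss, val_acc). The `best_checkpoint` function is hypothetical, and only the last file name in the list comes from the log; the other two are made up for illustration.

```python
import re

# Matches names like weights.10-0.14-0.94-0.41-0.90.hdf5 (pattern assumed from the log).
PATTERN = re.compile(
    r"weights\.(?P<epoch>\d+)-(?P<loss>\d+\.\d+)-(?P<acc>\d+\.\d+)"
    r"-(?P<val_loss>\d+\.\d+)-(?P<val_acc>\d+\.\d+)\.hdf5$"
)


def best_checkpoint(paths):
    """Return the path with the highest val_acc; ties go to the later epoch."""
    scored = []
    for path in paths:
        match = PATTERN.search(path)
        if match:  # skip files that do not follow the naming scheme
            scored.append((float(match["val_acc"]), int(match["epoch"]), path))
    if not scored:
        raise ValueError("no checkpoint files matched the expected pattern")
    return max(scored)[2]


# Only the last name appears in the log; the first two are invented examples.
candidates = [
    "/tmp/weights.03-0.35-0.86-0.40-0.85.hdf5",
    "/tmp/weights.07-0.20-0.92-0.39-0.87.hdf5",
    "/tmp/weights.10-0.14-0.94-0.41-0.90.hdf5",
]
print(best_checkpoint(candidates))
```

In practice the list could come from `glob.glob("/tmp/weights.*.hdf5")`. Since the val_acc in the name is rounded to two decimals, the epoch number is used as a tie-breaker.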

The string to judge is the same one used in the sample.
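
For a character-level CNN, a string like this has to be turned into a fixed-length sequence of numbers before it reaches the model. Here is a minimal sketch of one common way to do that, using Unicode code points. The function name `encode_text`, the default length of 200 and the zero padding are my assumptions and may differ from what aozora_classification.py actually does.

```python
def encode_text(text, max_length=200, pad_value=0):
    """Map each character to its Unicode code point, then truncate or pad to max_length."""
    codes = [ord(ch) for ch in text.strip()][:max_length]
    return codes + [pad_value] * (max_length - len(codes))


# First ten characters of "The Moon Over the Mountains" in the original Japanese.
sample = "隴西の李徴は博学才穎"
encoded = encode_text(sample, max_length=12)

print(len(encoded))   # 12: ten code points followed by two padding zeros
print(encoded[-2:])   # [0, 0]
```

A fixed length keeps every input the same shape, which is what a convolutional model expects.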

Let's judge the opening of Atsushi Nakajima's "The Moon Over the Mountains".

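"Li Zheng of Longxi was a man of wide learning and brilliant talent. In the last years of the Tianbao era, while still young, he had his name entered on the list of successful examination candidates and was then appointed Commandant of Jiangnan, but he could not bring himself to remain a lowly official."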

The result is...

Natsume Soseki Ryunosuke Akutagawa Ogai Mori Ango Sakaguchi
0 1.056089e-09 1.293081e-07 0.000033

It looks like Ogai Mori! It seems to have worked! (Of course it did, since I didn't change the code at all 🤗🤗🤗)
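
Reading a result row like the one above comes down to finding the author with the largest score. As a small sketch, assuming the scores arrive as a plain list in the same column order as the header: the `most_likely_author` helper is hypothetical, and the numbers below are made up for illustration, not the script's actual output.

```python
# Column order taken from the header of the result table.
AUTHORS = ["Natsume Soseki", "Ryunosuke Akutagawa", "Ogai Mori", "Ango Sakaguchi"]


def most_likely_author(scores, authors=AUTHORS):
    """Return (author, score) for the highest-scoring column."""
    if len(scores) != len(authors):
        raise ValueError("expected one score per author")
    best_index = max(range(len(scores)), key=lambda i: scores[i])
    return authors[best_index], scores[best_index]


# Invented scores for illustration only.
example_scores = [0.05, 0.10, 0.80, 0.05]
author, score = most_likely_author(example_scores)
print(author, score)  # Ogai Mori 0.8
```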

"I woke up if it was a dream, but we haven't done anything yet. Go ahead."

I tried judging this one as well. It looks like Natsume Soseki.

Natsume Soseki Ryunosuke Akutagawa Ogai Mori Ango Sakaguchi
0 0.125387 0.000199 6.651750e-07
