[PYTHON] I asked Deep Learning if recent Pokemon are like Digimon.

Introduction

This article is for people who know at least a little about Pokemon (you don't need to know Digimon).

When I helped analyze the questionnaire compiled by Uncle Takasa Pokemon around March, 155 of the 6353 free-form responses mentioned Digimon.

"It's indistinguishable from a Digimon."
"It's no longer Pokemon ... it's Digimon ... I want the old designs back ..."
"This is subjective, but the legendary Pokemon feel angular, which gives a Digimon-like impression."
"I feel there are more Pokemon with smooth shapes that look hairless all over. They look like Digimon."
"The designs are so cluttered that they are hard for children to draw. They are getting closer to Digimon."
"The designs up to Diamond and Pearl are Pokemon-like. After that, they have a Digimon-like feel."

I have been familiar with Pokemon since I was a child, but I had little contact with Digimon. When I hear "Digimon", what comes to mind is "Butter-Fly", which a friend used to sing at karaoke, and "Our War Game!", directed by Mamoru Hosoda. I don't know much about Digimon, but apparently **recent Pokemon are like Digimon**.

**Answers that mention "Digimon" in relation to Pokemon design** https://docs.google.com/spreadsheets/d/1FZFh8Mfa3-CU-qeFMGtZ9xByxbJ9wMj2mSEJDzV-tiQ/edit#gid=1753266897

**Article on "Pokemon and Digimon" by Uncle Takasa Pokemon** https://pkmnheight.blogspot.com/2020/07/7.html


What is Pokemon?

If you don't know, please see the following article. I know of no other article that investigates the topic in this much detail. If it is still unclear, it is probably fine to think of Pokemon as Pikachu or Pokemon GO.

Pokemon-ness-3_ Design History-First Development History http://pkmnheight.blogspot.com/2020/04/301.html

What is a Digimon?

**Digimon** is short for Digital Monster. It is a line of Bandai handheld raising games that first appeared in 1997, one year after the birth of Pokemon.

It extends Bandai's **Tamagotchi**, a big hit in 1996, from raising alone to communication and battling. While Pokemon games have mainly been sold for Nintendo handheld consoles, Digimon games have mainly been released for home consoles.


Pokemon and Digimon are alike in that both are fictional monsters designed with real creatures and objects as motifs, but **Pokemon and Digimon seem to differ in design taste**.

--An example of Pokemon design

--An example of Digimon design

The patterns and colors used, the head-to-body ratio of the characters, and the complexity of the designs are all quite different. So what does **"Recent Pokemon look like Digimon"** mean?

I came up with the following experiment.


Experimental method

"Hey ... so this is what Pokemon look like now. Wow! I don't recognize any of them!!! Pokemon has changed.

I used to play Pokemon, but I only know up to Gold and Silver.

**Aren't the current Pokemon like Digimon???**"

To reproduce the perception of someone who says this, I ran the following experiment.

**Experiment: Train a classifier on "old Pokemon" (up to Pokemon Gold and Silver, released in 1999) and on a random subset of Digimon. Then classify the Pokemon from Ruby and Sapphire through the latest games, together with the remaining Digimon, and tally the results. Repeat this 2000 times and compute the misclassification rates.**

The current official Pokemon images used in the experiment were obtained from the official **Pokemon Zukan**. https://zukan.pokemon.co.jp/ The old watercolor-style official art was borrowed from an overseas fan site. The Digimon images are those that had been published in the official **Digimon Encyclopedia** as of 2020.3.10. https://digimon.net/reference/

Images used

--Pokemon images up to Gold and Silver: 504 in total (251 current official images + 253 old watercolor official images)
(392 chosen at random for training)
Using both the current official art and the old watercolor art is expected to strengthen the classifier's sense of what a Pokemon looks like.


--Digimon images: 957 official images (402 chosen at random for training)
Some Digimon were excluded: those that share a pose with another Digimon and differ only in color, and those whose images include a drawn background.
--Exclusion example 1: one of the variants with the same design but different colors, "Agumon (black)" (center in the figure below)
--Exclusion example 2: "Kuzuhamon shrine maiden mode", which has a drawn background (right in the figure below)

--For testing
--Pokemon: 968 (804 from Ruby and Sapphire onward + 168 not used for training) (Mega Evolutions, regional forms, etc. are all treated as post-Ruby and Sapphire) (New Pokemon were added in the Pokemon Sword and Shield "Isle of Armor" expansion on June 17, 2020, but they are not included because it would have been a hassle)
--Digimon: the remaining 555 that were not used for training / validation

The training and test sets are reshuffled on every run, producing classifiers with differing notions of Pokemon-ness and Digimon-ness.
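
The repeated shuffle described above can be sketched roughly as follows. This is only a minimal sketch under stated assumptions: `run_trials`, `train_and_classify`, and `dummy_trainer` are hypothetical names, and the stand-in "classifier" guesses from the length of the name, whereas the real experiment fine-tunes VGG16 on images.

```python
import random
from collections import Counter

def run_trials(old_pokemon, new_pokemon, digimon, n_poke_train, n_digi_train,
               train_and_classify, n_trials, seed=0):
    """Count, per test image, how often it was tested and how often misclassified."""
    rng = random.Random(seed)
    tested, wrong = Counter(), Counter()
    for _ in range(n_trials):
        # Draw a fresh random training set on every run.
        poke_train = set(rng.sample(old_pokemon, n_poke_train))
        digi_train = set(rng.sample(digimon, n_digi_train))
        # Everything not used for training goes to the test set.
        test = [(name, "pokemon") for name in old_pokemon + new_pokemon
                if name not in poke_train]
        test += [(name, "digimon") for name in digimon if name not in digi_train]
        predict = train_and_classify(poke_train, digi_train)
        for name, label in test:
            tested[name] += 1
            if predict(name) != label:
                wrong[name] += 1
    return tested, wrong

def dummy_trainer(poke_train, digi_train):
    """Hypothetical stand-in for model training: long names are called Digimon."""
    return lambda name: "digimon" if len(name) >= 8 else "pokemon"

tested, wrong = run_trials(
    old_pokemon=["Pikachu", "Bulbasaur", "Togepi"],
    new_pokemon=["Mightyena", "Taillow"],
    digimon=["Agumon", "Terriermon", "Popomon"],
    n_poke_train=2, n_digi_train=2,
    train_and_classify=dummy_trainer, n_trials=20)

for name in ["Mightyena", "Taillow"]:
    print(name, wrong[name], "/", tested[name])
```

Keeping separate `tested` and `wrong` counters matters because old Pokemon and Digimon only land in the test set on the runs where they were not drawn for training, so the denominator differs per image.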

--Image size
Width: resized to 224 pixels
Height: resized to 224 pixels
Number of channels: 3

The images used for these experiments have been uploaded to github https://github.com/mrok273/Qiita/tree/master/%E3%83%9D%E3%82%B1%E3%83%A2%E3%83%B3/poke_vs_digi/data

--Model used
VGG16 pre-trained on ImageNet, with the output size of the final layer changed to 2 so that it classifies images as either Pokemon or Digimon.


Image illustration of data collection

I asked my wife to draw it.


Aggregate results

Correct answer rate

The correct answer rate was 90% for Digimon and 82% for Pokemon.

About 51% of Digimon were never mistaken for Pokemon. On the other hand, only about 18% of Pokemon were never mistaken for Digimon. Apparently it is harder to classify Pokemon as Pokemon.
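
As a rough illustration of how two figures like these can be derived from per-image tallies, here is a minimal sketch. The function name `summarize` and the numbers are made up for illustration; the article does not say whether its correct answer rate is averaged per classification or per image, so the per-classification version below is an assumption.

```python
def summarize(tested, wrong):
    """Return (overall correct-answer rate, share of images never misclassified)."""
    total_tests = sum(tested.values())
    total_wrong = sum(wrong.get(name, 0) for name in tested)
    never_wrong = sum(1 for name in tested if wrong.get(name, 0) == 0)
    return 1 - total_wrong / total_tests, never_wrong / len(tested)

# Hypothetical tallies for four images over 2000 runs each (not the article's data).
tested = {"A": 2000, "B": 2000, "C": 2000, "D": 2000}
wrong = {"A": 0, "B": 400, "C": 1000, "D": 40}

accuracy, never_share = summarize(tested, wrong)
print(f"correct answer rate: {accuracy:.1%}, never misclassified: {never_share:.0%}")
```

The two numbers answer different questions: the first is dominated by images that are wrong often, while the second only asks whether an image was ever wrong, which is why they can diverge as much as they do in the results above.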

The csv file of the aggregation result has been uploaded to github https://github.com/mrok273/Qiita/tree/master/%E3%83%9D%E3%82%B1%E3%83%A2%E3%83%B3/poke_vs_digi


Recent Pokemon are like Digimon!

The following is a summary, by generation, of the rates at which Pokemon were misclassified as Digimon.

Indeed, later generations tend to be misclassified as Digimon more often.
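
A per-generation summary of this kind can be computed by pooling the tallies of every Pokemon in a generation. The sketch below is only an illustration: `rate_by_generation` is a hypothetical helper and the rows are invented, not taken from the uploaded csv.

```python
from collections import defaultdict

def rate_by_generation(records):
    """records: iterable of (generation, times_tested, times_misclassified)."""
    tested = defaultdict(int)
    wrong = defaultdict(int)
    for generation, n_tested, n_wrong in records:
        tested[generation] += n_tested
        wrong[generation] += n_wrong
    # Pooled misclassification rate for each generation, in generation order.
    return {g: wrong[g] / tested[g] for g in sorted(tested)}

# Hypothetical rows: two Pokemon from generation 3 and two from generation 4.
records = [(3, 2000, 200), (3, 2000, 600), (4, 2000, 900), (4, 2000, 500)]
rates = rate_by_generation(records)
for generation, rate in rates.items():
    print(f"generation {generation}: {rate:.1%} misclassified as Digimon")
```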

Apparently, opinions such as

"Recent Pokemon are more like **Digimon**" "What is this now? It looks like a **Digimon**" "I don't think they are any different from **Digimon** anymore."

**are not without grounds**, and **generations 4, 7, and 8 seem to be especially Digimon-like.**


Which of the recent Pokemon is like a Digimon?

--Examples of Digimon-like Digimon

--Digimon-like Pokemon


--Pokemon-like Digimon

--Examples of Pokemon-like Pokemon (3rd generation or later)

Looking at the classification results, the following characteristics can be seen in ** Pokemon-like ** and ** Digimon-like **.

--Digimon
  --Black
  --Thorny design
  --Complex
  --Vivid colors
  --Heavily detailed drawing
--Pokemon
  --Pastel colors
  --Round
  --Simple

In particular, looking at the Digimon-like groups, designs with a higher **"amount of information in the design" (hereafter "information amount")**, as defined in an article I posted earlier, appear to be classified as Digimon more easily. In fact, there seems to be an overwhelming difference in information amount between Pokemon and Digimon.

This is the earlier article, which got a small buzz thanks to Uncle Takasa Pokemon: **"Did the design of recent Pokemon become complicated? [Python] [OpenCV]"** https://qiita.com/mrok273/items/6f0bcdc62b6184f79308

Pokemon with little information Pokemon with a lot of information

Does the AI feel that a design is "Digimon-like" simply because it is complicated?

The figure below shows the amount of information and the tendency of misclassification rates for Pokemon and Digimon. (The red line is the regression line for logistic regression by sns.regplot)

The more complex the design of Pokemon, the higher the misclassification rate for Digimon tends to be.

On the other hand, the simpler the design of Digimon, the higher the misclassification rate for Pokemon.

However, **information amount seems to be one of the factors behind Digimon-likeness, but it does not seem to be the only deciding factor.**

Where was the AI looking? Visualization #1. Visualizing contribution areas with Grad-CAM

The following page has a rich collection of CNN visualizations!!! https://github.com/utkuozbulak/pytorch-cnn-visualizations

The model used for classification passes the image through five stages of convolution and finally converts it to 4096 features, so it is hard for the human eye to tell what contributes to the classification result. A method called **Grad-CAM** makes this recognizable to humans to some extent.
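
For orientation, the core arithmetic of Grad-CAM as it is commonly described is small: average the gradient of the class score over each feature map to get one weight per map, take the weighted sum of the maps, and keep only the positive part. The sketch below shows just that step on tiny nested lists. It assumes the feature maps and gradients have already been obtained from the network (in practice via backpropagation), and `grad_cam` is a hypothetical helper, not code from the article or from the linked repository.

```python
def grad_cam(feature_maps, gradients):
    """feature_maps, gradients: lists of K maps, each an H x W list of floats."""
    height, width = len(feature_maps[0]), len(feature_maps[0][0])
    # One weight per feature map: the average gradient of the class score.
    alphas = [sum(sum(row) for row in grad) / (height * width) for grad in gradients]
    cam = [[0.0] * width for _ in range(height)]
    for alpha, fmap in zip(alphas, feature_maps):
        for i in range(height):
            for j in range(width):
                cam[i][j] += alpha * fmap[i][j]
    # ReLU: keep only positions that push the class score up.
    return [[max(0.0, value) for value in row] for row in cam]

# Two 2x2 feature maps: the first supports the class, the second opposes it.
maps = [[[1.0, 0.0], [0.0, 2.0]], [[0.0, 3.0], [1.0, 0.0]]]
grads = [[[0.5, 0.5], [0.5, 0.5]], [[-1.0, -1.0], [-1.0, -1.0]]]
heatmap = grad_cam(maps, grads)
print(heatmap)
```

Running the same computation once with the gradients of the "Pokemon" score and once with those of the "Digimon" score gives the two kinds of contribution maps discussed in this section.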

For example, the image below visualizes the area within the image that contributed to the classification as "dog".

Here is the result of applying Grad-CAM to Pokemon images.

Areas contributing to the "Digimon" classification Areas of contribution to the "Pokemon" classification

We can see that Bulbasaur's eyes contribute to the Digimon classification, while a wide range of other areas contribute to the Pokemon classification. This alone does not tell us what kind of features the model is looking at, so I devised the following.

Trick: build chimeras from a Pokemon and a Digimon that are equally Pokemon-like, then apply Grad-CAM

These are Pikachu, a Pokemon classified as Pokemon 100% of the time, and **Terriermon, a Digimon that was nevertheless misclassified as a Pokemon 2000 times out of 2000**.

Pikachu (Pokemon): Pokemon-ness 100%
Terriermon (Digimon): Pokemon-ness 100%

From the images of Pikachu and Terriermon, both of which are classified as Pokemon 100% of the time, four kinds of Pokemon-Digimon chimera images are created as follows.

Example of visualization with chimera image

For example, in the image below, four kinds of chimera images were created from the Pokemon Taillow (upper right) and the Digimon Popomon (lower right), and the Pokemon-classification and Digimon-classification contribution areas were obtained for each with Grad-CAM. This Pokemon and Digimon were paired because their rates of classification as Pokemon, 64.25% and 63.93% respectively, are the closest to each other.

Of the eight Grad-CAM results, **the four on the left show the areas contributing to the Digimon classification, and the four on the right show the areas contributing to the Pokemon classification.** Looking at the eight results this way, we can see that **Taillow's eyes** contribute more to the Digimon classification than the actual Digimon, Popomon (lower left of the figure), does.

In this way, by combining a Pokemon and a Digimon that are classified as Pokemon at a similar rate, **the image is neither too Digimon-like nor too Pokemon-like, and the contribution areas produced by Grad-CAM are easier to see than with a single image.**
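
The pairing step (matching each Pokemon with the Digimon whose rate of classification as Pokemon is closest) can be sketched as a nearest-value lookup. This is an assumption about how such pairs could be chosen, not the author's actual procedure; `closest_pairs` is a hypothetical helper, and "Examplemon" with its rate is invented to give the lookup something to reject.

```python
def closest_pairs(pokemon_rates, digimon_rates):
    """For each Pokemon, pick the Digimon with the nearest Pokemon-classification rate."""
    pairs = {}
    for poke, p_rate in pokemon_rates.items():
        pairs[poke] = min(digimon_rates,
                          key=lambda digi: abs(digimon_rates[digi] - p_rate))
    return pairs

# Rates for Taillow, Popomon, Pikachu and Terriermon are the ones quoted above;
# "Examplemon" is a made-up entry for illustration.
pokemon_rates = {"Taillow": 0.6425, "Pikachu": 1.0}
digimon_rates = {"Popomon": 0.6393, "Terriermon": 1.0, "Examplemon": 0.10}

pairs = closest_pairs(pokemon_rates, digimon_rates)
print(pairs)
```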


Where was the AI looking? Visualization #2. Filter visualization and intermediate layer extraction

The filter visualizations are as follows. The AI seems to work by looking at whether these structures exist in the image and how they are arranged.

The structures that respond well to the above filters are as follows.

The filter on the left responds well to fine jaggedness such as nails, feathers and hair, and the filter on the right responds well to round structures such as eyes.

Filters are labeled **"Pokemon classification contribution filters"** or **"Digimon classification contribution filters"** depending on which of the two classes they make an image more likely to be assigned to.

I used cnn_layer_visualization.py from Utku Ozbulak's github. https://github.com/utkuozbulak/pytorch-cnn-visualizations I visualized and extracted the intermediate output for the 512 channels of the 24th layer of the VGG19 net.

In PyTorch, the output of the middle layer can be obtained by changing forward as follows.

 import torch

 def forward(self, x, target=24):
     # Assumes self.features / self.classifiers hold the VGG layers
     # and self.net is the wrapped VGG model.
     results = []
     for i, layer in enumerate(self.features):
         x = layer(x)
         if i == target:  # Get the output of the middle layer here
             results.append(x)
     x = self.net.avgpool(x)
     x = torch.flatten(x, 1)  # Flatten everything except the batch dimension
     for i, layer in enumerate(self.classifiers):
         x = layer(x)
         if i == 6:  # Also get the final output
             results.append(x)
     return results

Example code that crops the area around the strongest response to a filter, based on the output of the middle layer

"""
The 24th layer I used has a 28x28 output and the original image is 224x224,
so each cell is in charge of an 8x8 pixel area.
To make the structure of the image easier to understand, this code
acquires an extra 4 pixels on every side of that area.
"""
import numpy as np
import pandas as pd
from PIL import Image

# layer_filename, filter_num and img_orig are assumed to be defined earlier in the script

# Read the saved output of the middle layer
temp_df = pd.read_csv(layer_filename)

# Create an np.array from the original image
img_array = np.array(img_orig)

# Index of the strongest response when the 28*28 map is flattened
target_index = int(temp_df.loc[filter_num].sort_values(ascending=False).index[0])

row = target_index // 28
col = target_index % 28
row_min = max(0, row*8 - 4)
row_max = min(row*8 + 12, 224)
col_min = max(0, col*8 - 4)
col_max = min(col*8 + 12, 224)


# If the target area is too white, do not get it
image_test = img_array[row*8:min(row*8+8, 224), col*8:min(col*8+8, 224)][:, :, :3]
pixels = image_test.reshape(-1, 3)
# Share of pixels whose R, G and B values are all 255
white_area_per = (pixels == 255).all(axis=1).mean()
if white_area_per < 0.95:

    # Filter output value
    attribute_value = temp_df.loc[filter_num, str(target_index)]
    # Crop the region
    image_array_part = img_array[row_min:row_max, col_min:col_max]
    # Create the image with PIL
    image_part = Image.fromarray(image_array_part).resize((64, 64))
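
The index arithmetic in the code above can be checked in isolation. The sketch below wraps it in a hypothetical helper, `cell_to_box`, using the same numbers as the article (a 28x28 map, 8 pixels per cell, a 4 pixel margin, a 224 pixel image); the function itself is not part of the original script.

```python
def cell_to_box(target_index, grid=28, cell=8, margin=4, size=224):
    """Map a flattened feature-map index to (row_min, row_max, col_min, col_max) in pixels."""
    row, col = divmod(target_index, grid)
    row_min = max(0, row * cell - margin)
    row_max = min(row * cell + cell + margin, size)
    col_min = max(0, col * cell - margin)
    col_max = min(col * cell + cell + margin, size)
    return row_min, row_max, col_min, col_max

print(cell_to_box(0))    # top-left cell, clipped at the image border
print(cell_to_box(29))   # row 1, col 1: a full 16x16 window
print(cell_to_box(783))  # bottom-right cell, clipped at the image border
```

Cells on the border get a smaller crop than interior cells, which is why the original code resizes every crop to 64x64 afterwards.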

Only the filter images that are significant in the classification result have been uploaded to the following github https://github.com/mrok273/Qiita/tree/master/%E3%83%9D%E3%82%B1%E3%83%A2%E3%83%B3/poke_vs_digi/hidden_layer


Pokemon classification contribution filter

Items that cannot be posted here have been uploaded to the following github https://github.com/mrok273/Qiita/tree/master/%E3%83%9D%E3%82%B1%E3%83%A2%E3%83%B3/poke_vs_digi/feature

Of the 512 filters, 40 contributed significantly to the Pokemon judgment. Here are some that were easy to interpret. In each pair, the image on the left is a visualization of the VGG filter and the image on the right shows the top 100 images that responded to that filter. (The numbers are for my own bookkeeping.)

--Vertical contour

--12. Claws, tips of wings, etc.

--74. Wings, antennae, etc.

--85. Curved structure

--214. Painting

--327. Contour (longitudinal curve)

Digimon Classification Contribution Filter

43 filters contributed significantly to the Digimon classification. I felt that more of them were easy to interpret than among the Pokemon classification contribution filters.

--11. Saturation

--91. Round eyes

--136. Gloss

--236. Metallic color

--160. Thorns

--271. Claws

--307. Hair

Summary of differences in characteristics between Pokemon and Digimon

Thorny rendering of claws, fangs, and hair

--Thorns on Digimon claws, fangs, and hair

--Thorns on Pokemon claws, fangs, and hair

--Thorns on the claws, fangs, and hair of Pokemon that were mistaken for Digimon

Basically, the thorns of Pokemon claws, fangs, and hair are often drawn as a rough zigzag. Digimon, on the other hand, are characterized by sharpness, density, and dark shadows.

To the AI, Pokemon from the first generation through Gold and Silver seem to have relatively modest thorns on their claws, fangs, and hair. To be sure, even first-generation Pokemon have sharp points on their claws and beaks. Pokemon with finely drawn fur were more likely to be misclassified as Digimon.

Expression of eyes

--Digimon eyes

--Pokemon eyes

--Eyes of Pokemon that were mistaken for Digimon

**Eyes with highlights are very likely to draw a Digimon judgment.** **The way eyes are drawn seems to have changed from the 4th generation in particular**, which may be why 4th-generation Pokemon have a high misclassification rate. Eyes that draw a Pokemon judgment, on the other hand, are often drawn as small dots or lines.

The figure below, often posted on reddit overseas under the title **"Pokemon eye design. Now and then"**, shows how the drawing of Pokemon eyes has changed. The questionnaire also included comments such as "Recent Pokemon no longer have triangular eyes and don't look like Pokemon!!!" Triangular eyes do seem to be recognized by some people as a Pokemon trait. After looking at many Grad-CAM images, however, my impression is that round eyes are Digimon-like to the AI, while triangular eyes do not serve much as a criterion for Pokemon.

Glossy is Digimon. Rugged is Pokemon.

--Glossy rendering in Digimon

Digimon have many metallic textures and are characterized by the way gradients are applied.

--Rugged rendering in Pokemon

On the other hand, perhaps because Pokemon has a Rock type, rugged textures and structures tend to draw a Pokemon judgment.

Small faces are judged as Digimon

--Small faces of Digimon (only a small sample out of a huge number)

--Small faces of Pokemon that were mistaken for Digimon

Small faces were judged as Digimon, probably because the face is where lines and shadows tend to concentrate. My impression is that this small-face rendering is increasing in recent generations.

Different colors

As for colors, there was a lot I could not understand just by looking at the filters, so I used opencv and other tools to tally the colors of each image.

--Saturation is different. The graph below summarizes the highest saturation in each Pokemon and Digimon image. It can be seen that only Digimon use distinctly different saturation.

--Colors are spread differently. The graph below shows how the colors used in each image are spread. Digimon colors tend to be dispersed, and it can also be seen that Pokemon use more colors with each generation.

How the color spread is calculated: convert RGB to Lab and cluster similar colors with DBSCAN. Convert the a and b values of each cluster to an angle in radians with arctan and build a weighted resultant vector from those angles. The longer the resultant vector, the more concentrated the colors; the shorter it is, the more dispersed the colors.
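
The last step of that calculation, the weighted resultant vector, can be sketched as follows. This is a minimal sketch under several assumptions not stated in the article: the Lab conversion and DBSCAN clustering are taken as already done, each cluster is summarized as `(a, b, weight)` with the weight standing for something like its pixel count, `math.atan2` is used in place of a plain arctan so that all four quadrants are handled, and the length is normalized by the total weight. `color_concentration` is a hypothetical name.

```python
import math

def color_concentration(clusters):
    """clusters: list of (a, b, weight) in Lab space; returns a length between 0 and 1."""
    total = sum(weight for _, _, weight in clusters)
    x = y = 0.0
    for a, b, weight in clusters:
        angle = math.atan2(b, a)  # hue angle of the cluster in radians
        x += weight * math.cos(angle)
        y += weight * math.sin(angle)
    # Length of the weighted resultant vector, normalized by the total weight.
    return math.hypot(x, y) / total

# Two clusters with the same hue: fully concentrated.
same_hue = color_concentration([(50.0, 0.0, 3), (40.0, 0.0, 1)])
# Two equally weighted clusters with opposite hues: fully dispersed.
opposite = color_concentration([(50.0, 0.0, 1), (-50.0, 0.0, 1)])
print(same_hue, opposite)
```

Working with angles rather than raw a and b values means the measure reflects how hues are spread around the color wheel, independent of how strongly saturated each cluster is.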


Experiment: Let's make Mightyena like a Pokemon!

This is Mightyena, a Pokemon that was mistaken for a Digimon **1997 out of 2000 times**.

Those who know Mightyena may think, **"Mightyena is Digimon-like??? What are you talking about??? Isn't Cinderace the Digimon-like one???"** But to the AI it is extremely Digimon-like, **mistaken for a Digimon more than four times as often as Cinderace.**

| Search term | Number of hits | Number of misclassifications by AI |
|---|---|---|
| "Mightyena Digimon-like" | 8 | 1997 |
| "Cinderace Digimon-like" | About 277,000 | 445 |

I asked my wife to create 22 images in which Mightyena's Digimon-like parts were modified little by little, had randomly trained classifiers classify each image 1000 times, and measured which parts contributed to Pokemon-ness.

The Mightyena images used in the experiment have been uploaded to the following github https://github.com/mrok273/Qiita/tree/master/%E3%83%9D%E3%82%B1%E3%83%A2%E3%83%B3/poke_vs_digi/graena_change

**The results are as follows.**

| Change | Contribution rate to Pokemon-ness | Comment |
|---|---|---|
| No processing | -- | No processing at all. Very close to a Digimon. |
| Hair tip summary (angular) | 0.8% | Almost no effect. |
| Head | 7.0% | The outline of the hair on the head was rounded. Depending on the training run, it can instead become more Digimon-like. |
| Toes | 7.0% | According to Grad-CAM, the classifier responded to the lines between the toes rather than the claws, so they were removed. The changed area is small, yet it became more Pokemon-like. |
| Triangular eyes | 7.3% | Introduced the triangular eyes that were used for Pokemon from the first generation through Gold and Silver. Not much effect on Pokemon-ness. |
| Highlight simplification | 15.5% | Removed the jagged highlights that were triggering the Digimon judgment. |
| Hair tip cut | 23.8% | Mainly rounded the hair tips along the contour. Greatly improved Pokemon-ness. |
| Hair tip summary (round) | 65.7% | The hair bundles are gathered even further. At this point it is practically a Pokemon. |

| | Most Digimon-like Mightyena (original) | Most Pokemon-like Mightyena (hair tip summary (round) + triangular eyes + highlight + toes) |
|---|---|---|
| Pokemon-ness | 1.3% | 94.0% |

Many Pokemon fans will think, **"Triangular eyes are the most Pokemon-like feature!!"** For the AI, however, it is **a minor feature with a contribution rate of 7.3%.**

The jaggedness of the hair had a larger effect than that: just gathering the tips of the hair together contributed 65.7% to Pokemon-ness.

It seems that the jagged edges of the hair had a great influence on the AI's recognition of Mightyena as a Digimon. I checked the images of all Pokemon, and indeed no other Pokemon had fur drawn this way. You might think, "That can't be true! There must be more!" But for most Pokemon that seem to have fluffy hair, the hair is drawn as large rounded bundles, as in "Hair tip summary (round)" from the experiment above.

Even the hair of Grimmsnarl, a Pokemon characterized by long hair growing from its whole body, has its ends gathered together into rounded shapes.


Hypothesis: "Digimon-likeness" differs between humans and AI

Of the 155 answers that mentioned "Digimon", 34 by my count (22 by Uncle Takasa Pokemon's count) referred to when Pokemon started to look like Digimon, for example "generation N looks like Digimon". (Since there are only 34 data points, please take what follows with a grain of salt.) http://pkmnheight.blogspot.com/2020/07/7.html#itu

According to the deep learning classifier, the Pokemon with Digimon-like designs were **generations 4, 7, and 8**. On the other hand, the ones most often mentioned in the questionnaire were **generations 5, 7, and 8**. What exactly does this mean? Can humans and AI not understand each other after all?

A hint about this difference between humans and AI was found in a previous article by Uncle Takasa Pokemon.

**Pokemon-ness-5_Hitogata increased?** http://pkmnheight.blogspot.com/2020/05/5.html

**The 5th generation has many upright Pokemon.**

| Generation | All species | Upright number | Upright rate | Rate of increase |
|---|---|---|---|---|
| 1st generation | 151 | 36 | 23.8% | |
| 2nd generation | 100 | 28 | 28.0% | 117.4% |
| 3rd generation | 135 | 33 | 24.4% | 87.3% |
| 4th generation | 107 | 27 | 25.2% | 103.2% |
| 5th generation | 156 | 51 | 32.7% | 129.6% |
| 6th generation | 72 | 20 | 27.8% | 85.0% |
| 7th generation | 86 | 27 | 31.4% | 113.0% |
| 8th generation | 89 | 33 | 37.1% | 118.1% |
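
The upright rate and rate of increase columns follow directly from the two count columns: the upright rate is the upright number divided by all species, and the rate of increase is that rate divided by the previous generation's rate. A minimal sketch that recomputes them (the counts are the ones in the table; `upright_table` is a hypothetical helper name):

```python
# (generation, all species, upright number) as listed in the table
counts = [(1, 151, 36), (2, 100, 28), (3, 135, 33), (4, 107, 27),
          (5, 156, 51), (6, 72, 20), (7, 86, 27), (8, 89, 33)]

def upright_table(counts):
    """Return (generation, upright rate, rate relative to the previous generation)."""
    rows, previous = [], None
    for generation, total, upright in counts:
        rate = upright / total
        increase = None if previous is None else rate / previous
        rows.append((generation, rate, increase))
        previous = rate
    return rows

rows = upright_table(counts)
for generation, rate, increase in rows:
    change = "" if increase is None else f"{increase:.1%}"
    print(f"generation {generation}: upright rate {rate:.1%} {change}")
```

A rate of increase above 100% means the share of upright Pokemon grew relative to the generation before, which is the quantity compared against the questionnaire.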

I compared how the percentage of upright Pokemon rose or fell relative to the previous generation with the generations that were named as "Digimon-like" in the questionnaire. Except for the 2nd generation, the ups and downs matched almost exactly: "5th generation is high", "6th generation is calm", and "7th and 8th generations are high".

Perhaps the Digimon-likeness of Pokemon in human perception is largely due to whether or not they stand upright on two legs. This is because even those who are not familiar with Digimon know that Digimon come to stand upright as they evolve (more specifically, they come to look like humans and super robots). (Actually, when I checked the upright rate of Digimon, only for those listed under "A", **50% of Digimon were upright**)

In other words, behind the human feeling that "recent Pokemon are like Digimon",

  1. **Digimon stand on two legs as they evolve**
  2. **Recent Pokemon stand on two legs**
  3. **Therefore, recent Pokemon look like Digimon**

**this syllogism may be at work.**


Summary

--I built classifiers trained on old Pokemon images, from the first generation through Gold and Silver, and on randomly chosen Digimon images, and had them classify the new Pokemon and the remaining Digimon. This was repeated 2000 times and the misclassification rates were tallied.
--As a result, Pokemon of generations 4, 7, and 8 had high rates of misclassification as Digimon. In other words, **the more recent the Pokemon, the more Digimon-like it can be said to be.**
--**Pokemon with more complicated designs are more likely to be classified as Digimon, but saturation, round eyes, small faces, and so on also seem to affect "Digimon-likeness" for the AI.** These may be only a small part of the features, namely the ones I was able to grasp.
--By combining two methods, applying Grad-CAM to paired Pokemon and Digimon with the same "Pokemon-likeness" and checking the output of the intermediate layer for each filter channel, I was able to understand to some extent that the AI classifies images based on several characteristics such as claws, paint, eyes, and metallic shine.
--What deep learning considers "Digimon-like" differs from what humans consider "Digimon-like"; humans may sense Digimon-likeness more from the context of upright bipedal posture.

Reflections

--I would have liked to hear the opinions of people who know Digimon well. The Pokemon were split into old and new for training, but I could not work out such a split for Digimon even after looking into it, so the Digimon were chosen completely at random.
--I listed several "conditions under which a Pokemon is mistaken for a Digimon", but only a few of them could be interpreted.
--To bring the AI's classification closer to human intuition, it would be necessary to extract how upright the figure in each image is. Technologies such as pose estimation and posture recognition have been developed for images of real humans, but they were hard to apply to Pokemon illustrations: locating and counting the hands was out of the question, and even face recognition was difficult.