[PYTHON] Pokemon machine learning, the Nth rehash

Introduction

I wanted to try machine learning, so I picked Pokemon as the subject. Each Pokemon has fixed base stats, so I figured it would be a treasure trove of data.

However, this article is a strictly inferior version of the one that comes up first when you search Google for "Pokemon Machine Learning", so if you want to follow along, please refer to the original article: Machine learning with Pokemon

Environment

OS: Windows 10 Home
IDE: VS Code
Language: Python 3.7.3 (64-bit)

What I did

Using a Pokemon database covering up to generation 7, I extracted the Pokemon with the "flight" (Flying) or "Esper" (Psychic) type and tried to tell the two apart with binary classification by logistic regression. For reference, the number of Pokemon of each type (up to generation 7) is as follows.

| Type (label used in the code) | Number of Pokemon |
|---|---|
| Normal (normal) | 116 |
| Fighting | 63 |
| Poison (Doku) | 69 |
| Ground (Jimen) | 75 |
| Flying (flight) | 113 |
| Bug (insect) | 89 |
| Rock (Iwa) | 67 |
| Ghost (ghost) | 55 |
| Steel | 58 |
| Fire | 72 |
| Water (Mizu) | 141 |
| Electric (Denki) | 60 |
| Grass (Kusa) | 103 |
| Ice | 43 |
| Psychic (Esper) | 100 |
| Dragon | 59 |
| Dark (Evil) | 59 |
| Fairy | 54 |

Water is the most common type and Ice the rarest. Freeze-Dry does double damage to Water, after all.

The code is below.

lr_pokemon.py



import pandas as pd
import codecs
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
import matplotlib.pyplot as plt


# read data by pandas
with codecs.open("data/pokemon_status.csv", "r", "Shift-JIS", "ignore") as file:
    df = pd.read_table(file, delimiter=",")

# print(df.head(15))

p_type = ["normal","Fighting","Doku","Jimen","flight","insect","Iwa","ghost","Steel","Fire","Mizu","Denki","Kusa","Ice","Esper","Dragon","Evil","Fairy"]
print(len(p_type))

# make functions
def count_type(p_type):
    # Count Pokemon that have p_type as either their first or second type
    list1 = df[df['Type 1'] == p_type]
    list2 = df[df['Type 2'] == p_type]
    lists = pd.concat([list1, list2])
    print(p_type + " Pokemon: %d" % len(lists))

def type_to_num(p_type):
    if p_type == "flight":
        return 1
    else:
        return 0

# count number of type in pokemons
for i in p_type:
    count_type(i)

# make sky_df
sky1 = df[df['Type 1'] == "flight"]
sky2 = df[df['Type 2'] == "flight"]
sky = pd.concat([sky1, sky2])

# make psycho_df
psycho1 = df[df['Type 1'] == "Esper"]
psycho2 = df[df['Type 2'] == "Esper"]
psycho = pd.concat([psycho1, psycho2])

df_s_p = pd.concat([sky, psycho], ignore_index=True)

type1 = df_s_p['Type 1'].apply(type_to_num)
type2 = df_s_p['Type 2'].apply(type_to_num)
df_s_p['type_num'] = type1 + type2

print(df_s_p)

X = df_s_p.iloc[:,7:13].values
y = df_s_p['type_num'].values

X_train,X_test,y_train,y_test = train_test_split(X, y, test_size = 0.3, random_state = 0)

lr = LogisticRegression(C = 1.0)
lr.fit(X_train, y_train)
# show scores
print("train_score: %.3f" % lr.score(X_train, y_train))
print("test_score: %.3f" % lr.score(X_test, y_test))

i = 0
error1 = 0
success1 = 0
error2 = 0
success2 = 0
print("[List of Pokemon judged to be flying type]")
print("----------------------------------------")
print("")
while i < len(df_s_p):
    # Predict one row at a time: 1 = "flight", 0 = "Esper"
    y_pred = lr.predict(X[i].reshape(1, -1))[0]
    actual = df_s_p.loc[i, "type_num"]
    print(df_s_p.loc[i, "Pokemon name"])
    if y_pred == 1:
        if actual == 1:
            # Predicted Flying and actually Flying: correct
            success1 += 1
            print("This is a Flying type, right?")
        else:
            # Predicted Flying but actually Esper: wrong
            error1 += 1
            print("I thought this was a Flying type")
    else:
        if actual == 0:
            # Predicted Esper and actually Esper: correct
            success2 += 1
            print("This is an Esper type, right?")
        else:
            # Predicted Esper but actually Flying: wrong
            error2 += 1
            print("I thought this was an Esper type")
    print("")
    i += 1
print("----------------------------------------")
print("Pokemon correctly judged to be Flying type: %d" % success1)
print("Pokemon correctly judged to be Esper type: %d" % success2)
print("Pokemon mistakenly judged to be Flying type: %d" % error1)
print("Pokemon mistakenly judged to be Esper type: %d" % error2)
print("")
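
One thing to watch in the listing above: `sky` and `psycho` are joined with `pd.concat`, so a Pokemon that has both the "flight" and the "Esper" type ends up in `df_s_p` twice, and the two copies can land on opposite sides of the train/test split. A minimal sketch of one way to pick each Pokemon only once with a boolean mask. The four-row table is a stand-in made up for illustration (it is not the real CSV, only the column names are borrowed from the script), and `select_two_types` is a hypothetical helper name.

```python
import pandas as pd

# Tiny stand-in table (assumption: same column names as the real CSV).
toy = pd.DataFrame({
    "Pokemon name": ["Pidgey", "Abra", "Natu", "Machop"],
    "Type 1": ["normal", "Esper", "Esper", "Fighting"],
    "Type 2": ["flight", None, "flight", None],
})

def select_two_types(df, positive="flight", negative="Esper"):
    """Keep each row that has either type exactly once; label 1 if `positive` is present."""
    has_pos = (df["Type 1"] == positive) | (df["Type 2"] == positive)
    has_neg = (df["Type 1"] == negative) | (df["Type 2"] == negative)
    keep = has_pos | has_neg
    out = df.loc[keep].copy()
    # Same labelling rule as type_num in the script: any "flight" type counts as 1.
    out["type_num"] = has_pos[keep].astype(int)
    return out.reset_index(drop=True)

selected = select_two_types(toy)
print(selected)
```

Here Natu, which has both types, appears once with label 1, matching how `type_num` treats it in the original script.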




Result

The accuracy came out at 75%. That's low. It is not a figure you could call usable for machine learning.
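To put the 75% in context: with 113 "flight" rows and 100 "Esper" rows, always answering "flight" would already score at least 113/213, roughly 53%. A confusion matrix also gives the four counters from the script in one call. The following is a minimal sketch, assuming scikit-learn is available; the labels are invented purely to show the calls and are not the article's predictions.

```python
import numpy as np
from sklearn.metrics import accuracy_score, confusion_matrix

# Invented labels for illustration only: 1 = "flight", 0 = "Esper".
y_true = np.array([1, 1, 1, 1, 1, 0, 0, 0, 0, 0])
y_pred = np.array([1, 1, 1, 0, 0, 0, 0, 0, 1, 1])

# Rows are the true class, columns the predicted class, in the order [1, 0].
cm = confusion_matrix(y_true, y_pred, labels=[1, 0])
(flight_ok, flight_as_esper), (esper_as_flight, esper_ok) = cm

accuracy = accuracy_score(y_true, y_pred)
# Accuracy of always predicting the most frequent class.
baseline = np.bincount(y_true).max() / len(y_true)

print(cm)
print("accuracy: %.3f, majority baseline: %.3f" % (accuracy, baseline))
```

In the article's script, `y_true` would be `y_test` and `y_pred` would be `lr.predict(X_test)`.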

I expected better numbers, because I assumed Flying types would mostly be physical attackers and Esper types mostly special attackers. Reality is not that simple. Still, when I looked at the Pokemon that were actually misclassified, I could see why it happened.

For example, "Thunder" (Zapdos) and "Freezer" (Articuno) were Flying types mistaken for Esper, which makes sense given how high their Special Attack is. Even I would get those wrong at first glance. Conversely, "Abra" and "Ralts" were Esper types mistaken for Flying, but that can't be helped because their base stats are low, and at the low end of the range there is little room for the numbers to differ. Their evolutions "Hoodin" (Alakazam), "Gardevoir", and "Gallade" were all firmly classified as Esper, so I'm relieved.




Hmm???


![erureido.jpg](https://qiita-image-store.s3.ap-northeast-1.amazonaws.com/0/208060/08320895-7b06-30f7-e6c0-c5a2080413b6.jpeg)

Gallade! You got mistaken for a Flying type after all!!!
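
The hunch that "flight" leans physical and "Esper" leans special can be checked by reading the fitted coefficients of the logistic regression. A rough sketch on synthetic numbers: the stat names, the random data, and the `StandardScaler` step are all assumptions added for illustration and are not part of the original script (which six columns `iloc[:, 7:13]` picks up depends on the CSV layout).

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

rng = np.random.RandomState(0)
# Assumed stat names; the real CSV's column names may differ.
stats = ["HP", "Attack", "Defense", "Sp. Atk", "Sp. Def", "Speed"]

# Synthetic stand-in data: class 1 gets higher Attack, class 0 higher Sp. Atk.
n = 200
y = np.repeat([1, 0], n // 2)
X = rng.normal(70, 15, size=(n, 6))
X[y == 1, 1] += 30  # boost Attack for class 1
X[y == 0, 3] += 30  # boost Sp. Atk for class 0

# Standardize so coefficient sizes are comparable across stats.
X_std = StandardScaler().fit_transform(X)
lr = LogisticRegression(C=1.0).fit(X_std, y)

# A positive weight pushes toward class 1, a negative one toward class 0.
for name, w in sorted(zip(stats, lr.coef_[0]), key=lambda t: t[1]):
    print("%-8s %+.2f" % (name, w))
```

If a Pokemon like Gallade has high Attack and low Special Attack, a model whose weights look like this would push it toward class 1 regardless of its actual type.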

In conclusion

After all, it is a mistake to judge Pokemon by base stats alone.
