[PYTHON] Finding a mathematical model of the experience points required to level up in DQ Walk (2)

Overview (TL;DR)

We finished checking the data last time, so the next step is to calculate the mathematical model. (I'm still studying this, so I'd be happy if you could point out any mistakes.)

Finding a mathematical model of the experience points required to level up in DQ Walk (1)

Regression analysis

First, import the required libraries.

import pandas as pd
import numpy as np
import scipy as sp

import matplotlib.pyplot as plt
import seaborn as sns
from sklearn import linear_model
sns.set()

%matplotlib inline
%precision 3

Load the data with read_csv().

#EXP required to go from each level to the next
df = pd.read_csv('data.csv',names=['EXP'])
#Cumulative EXP
df['CUMSUM_EXP'] = df['EXP'].cumsum()
#Shift the index so it starts at level 1
df.index = df.index + 1
df.head()
Screen Shot 2020-01-31 at 21.49.26.png

Feed the data to sklearn's linear_model

First, the experience points required to reach the next level

reg = linear_model.LinearRegression()
 
X = df.index
Y = df['EXP']
#Create a predictive model
reg.fit(X, Y)
#Regression coefficient
print(reg.coef_)
#Intercept
print(reg.intercept_)
#R2 (coefficient of determination)
print(reg.score(X, Y))

*** Jupyter throws an error when I run this!! ***

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-54-69fa63dab1be> in <module>
      6 Y = df['EXP']
      7 #Create a predictive model
----> 8 reg.fit(X, Y)
      9 #Regression coefficient
     10 print(reg.coef_)

/usr/local/lib/python3.7/site-packages/sklearn/linear_model/base.py in fit(self, X, y, sample_weight)
    461         n_jobs_ = self.n_jobs
    462         X, y = check_X_y(X, y, accept_sparse=['csr', 'csc', 'coo'],
--> 463                          y_numeric=True, multi_output=True)
    464 
    465         if sample_weight is not None and np.atleast_1d(sample_weight).ndim > 1:

/usr/local/lib/python3.7/site-packages/sklearn/utils/validation.py in check_X_y(X, y, accept_sparse, accept_large_sparse, dtype, order, copy, force_all_finite, ensure_2d, allow_nd, multi_output, ensure_min_samples, ensure_min_features, y_numeric, warn_on_dtype, estimator)
    717                     ensure_min_features=ensure_min_features,
    718                     warn_on_dtype=warn_on_dtype,
--> 719                     estimator=estimator)
    720     if multi_output:
    721         y = check_array(y, 'csr', force_all_finite=True, ensure_2d=False,

/usr/local/lib/python3.7/site-packages/sklearn/utils/validation.py in check_array(array, accept_sparse, accept_large_sparse, dtype, order, copy, force_all_finite, ensure_2d, allow_nd, ensure_min_samples, ensure_min_features, warn_on_dtype, estimator)
    519                     "Reshape your data either using array.reshape(-1, 1) if "
    520                     "your data has a single feature or array.reshape(1, -1) "
--> 521                     "if it contains a single sample.".format(array))
    522 
    523         # in the future np.flexible dtypes will be handled like object dtypes

ValueError: Expected 2D array, got 1D array instead:
array=[ 1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48
 49 50 51 52 53 54 55].
Reshape your data either using array.reshape(-1, 1) if your data has a single feature or array.reshape(1, -1) if it contains a single sample.

Apparently, X needs to be a 2D array. Sure enough, the sklearn reference says the same. Screen Shot 2020-01-31 at 21.57.12.png

So, although it's a bit crude, let's prepare a two-dimensional array.

X = []
for i in range(1,56):
    X.append([i])
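
As an aside (my addition, not from the original post): NumPy can build the same column vector in one line. A minimal sketch, equivalent to the loop above:

#Equivalent, more idiomatic way to build the (55, 1) feature matrix
X = np.array(df.index).reshape(-1, 1)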

Now, let's try it again.

reg = linear_model.LinearRegression()
 
Y = df['EXP']
#Create a predictive model
reg.fit(X, Y)
#Regression coefficient
print(reg.coef_)
#Intercept
print(reg.intercept_)
#R2 (coefficient of determination)
print(reg.score(X, Y))
Screen Shot 2020-01-31 at 22.04.06.png

*** Alright, the regression analysis succeeded! *** (The formula itself is written out separately in a Markdown cell in Jupyter.)
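
If you'd rather generate that formula straight from the fitted object instead of typing it by hand, here is a minimal sketch using the reg fitted above:

#Print the fitted line as a formula: EXP ≈ coef * LEVEL + intercept
print(f"EXP ≈ {reg.coef_[0]:.2f} * LEVEL + {reg.intercept_:.2f}")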

Visualize it, as naturally as breathing.

plt.plot(df.index,df['EXP'],label="EXP")
plt.plot(X,reg.predict(X),label="LinearRegression")
plt.xlabel('LEVEL')
plt.ylabel('EXP')
plt.legend()
plt.grid(True)

image.png

This is no good... With R2 at 0.377, the model is practically meaningless. (That's fine, this is just for practice.)
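
As a reminder (my addition), the value returned by reg.score() is the coefficient of determination,

$$R^2 = 1 - \frac{\sum_i (y_i - \hat{y}_i)^2}{\sum_i (y_i - \bar{y})^2},$$

so 0.377 means the linear model explains only about 38% of the variance in the per-level EXP.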

Next, the cumulative experience points required to reach a given level

reg2 = linear_model.LinearRegression()
 
Y2 = df['CUMSUM_EXP']
#Create a predictive model
reg2.fit(X, Y2)
#Regression coefficient
print(reg2.coef_)
#Intercept
print(reg2.intercept_)
#R2 (coefficient of determination)
print(reg2.score(X, Y2))
Screen Shot 2020-01-31 at 22.03.36.png

Visualize it, as naturally as breathing.
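
(The plotting code isn't shown in the original here; a minimal sketch, assuming it mirrors the previous plot with reg2 and the cumulative column:)

plt.plot(df.index,df['CUMSUM_EXP'],label="CUMSUM_EXP")
plt.plot(X,reg2.predict(X),label="LinearRegression")
plt.xlabel('LEVEL')
plt.ylabel('EXP')
plt.legend()
plt.grid(True)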

image.png

This is also no good... Sure, R2 is 0.575, which is still not meaningful, although better than before. (That's fine, this is just for practice.)

Higher-degree terms

It's clear that a first-degree (linear) equation won't do, so let's add higher-degree terms.

Second degree

First, create the explanatory variables

D1 = []
D2 = []
for i in range(1,56):
    D1.append(i)
    D2.append(i**2)
df_x = pd.DataFrame({"D1":D1,"D2":D2})
df_x.head()
Screen Shot 2020-01-31 at 22.22.33.png
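
For reference (my addition, not in the original): scikit-learn's PolynomialFeatures can generate these power columns automatically. A minimal sketch that produces the same two columns:

from sklearn.preprocessing import PolynomialFeatures

#degree=2 without the constant column -> columns [level, level**2]
poly = PolynomialFeatures(degree=2, include_bias=False)
X_poly = poly.fit_transform(np.array(df.index).reshape(-1, 1))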

Feed the data to sklearn's linear_model

First, the experience points required to reach the next level

reg3 = linear_model.LinearRegression()
X3 =df_x
Y3 = df['EXP']
#Create a predictive model
reg3.fit(X3, Y3)
#Regression coefficient
print(reg3.coef_)
#Intercept
print(reg3.intercept_)
#R2 (coefficient of determination)
print(reg3.score(X3, Y3))
Screen Shot 2020-01-31 at 22.25.28.png

Visualize it, as naturally as breathing.

plt.plot(df.index,df['EXP'],label="EXP")
plt.plot(X,reg3.predict(X3),label="LinearRegression")
plt.xlabel('LEVEL')
plt.ylabel('EXP')
plt.legend()
plt.grid(True)

image.png

R2 is 0.644, which is better than before, but it's still hard to call this a good fit.

Next, the cumulative experience points required to reach a given level

reg4 = linear_model.LinearRegression()
X4 =df_x
Y4 = df['CUMSUM_EXP']
#Create a predictive model
reg4.fit(X4, Y4)
#Regression coefficient
print(reg4.coef_)
#Intercept
print(reg4.intercept_)
#R2 (coefficient of determination)
print(reg4.score(X4, Y4))
Screen Shot 2020-01-31 at 22.28.27.png

Visualize it, as naturally as breathing.

plt.plot(df.index,df['CUMSUM_EXP'],label="CUMSUM_EXP")
plt.plot(X,reg4.predict(X4),label="LinearRegression")
plt.xlabel('LEVEL')
plt.ylabel('EXP')
plt.legend()
plt.grid(True)

image.png

R2 is 0.860, so this looks pretty good, doesn't it?

Third degree and beyond

The code is the same, so only the resulting graphs are shown. (A generalized sketch of the feature construction follows below.)
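
For completeness, here is a sketch that generalizes the degree-2 loop above to any degree (make_poly_df is my own helper name, not from the original):

def make_poly_df(levels, degree):
    #Columns D1..Dn hold level, level**2, ..., level**degree
    return pd.DataFrame({f"D{d}": [lv ** d for lv in levels] for d in range(1, degree + 1)})

#e.g. the fifth-degree explanatory variables for levels 1-55
df_x5 = make_poly_df(range(1, 56), 5)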

Third degree

image.png

Fourth degree

image.png

The cumulative model now fits quite well; R2 is up to 0.9733.

Fifth degree

image.png

Isn't it fair to say that both are pretty much fitted now? R2 is 0.961 for the experience points required to reach the next level and 0.987 for the cumulative experience points.
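
A compact way to reproduce this degree-by-degree comparison in one loop (my own sketch, reusing the make_poly_df helper above):

for degree in range(1, 6):
    Xd = make_poly_df(range(1, 56), degree)
    r2_exp = linear_model.LinearRegression().fit(Xd, df['EXP']).score(Xd, df['EXP'])
    r2_cum = linear_model.LinearRegression().fit(Xd, df['CUMSUM_EXP']).score(Xd, df['CUMSUM_EXP'])
    print(f"degree {degree}: R2(EXP)={r2_exp:.3f}  R2(CUMSUM_EXP)={r2_cum:.3f}")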

And into the legend...

Let's calculate the cumulative experience points that would be required if the current level cap of 55 were raised to 99, as in the original games.
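
A sketch of how that extrapolation might be done, assuming reg_c5 is the fifth-degree model fitted on CUMSUM_EXP (the original doesn't show its variable name, so reg_c5 is my own):

#Fifth-degree features for levels 1-99, then predict cumulative EXP
X99 = make_poly_df(range(1, 100), 5)
pred_99 = reg_c5.predict(X99)
print(f"Predicted cumulative EXP at level 99: {pred_99[-1]:,.0f}")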

The fitted model formula is here: Screen Shot 2020-01-31 at 23.10.47.png

Visualizing this:

image.png

** You will need nearly 1.4 billion EXP (1,392,549,526) to reach level 99. (It's only a prediction, of course.) ** For reference, the experience required to reach level 55 is 3,441,626 (about 3.5 million), so that's the equivalent of leveling up to 55 about 404 times.

That works out to 66,311 Metal Hoimin, 132,623 Liquid Metal Slimes (hagure metal), or 904,252 Metal Slimes. Hmm, we'd have to hunt them in numbers that might get the WWF to step in.
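
For reference, those counts are just the total divided by each monster's EXP yield; the per-kill values below (roughly 21,000 / 10,500 / 1,540) are back-calculated from the counts in this post, not taken from official data:

total = 1_392_549_526
for name, exp_per_kill in [("Metal Hoimin", 21_000), ("Liquid Metal Slime", 10_500), ("Metal Slime", 1_540)]:
    print(f"{name}: {total // exp_per_kill:,} kills")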

Game Over
