[PYTHON] [DanceDanceRevolution] Can the difficulty level (foot rating) be predicted from the groove radar values?

DanceDanceRevolution[^DDR] is a music game developed by KONAMI. Each chart in DanceDanceRevolution has a difficulty level[^level] that indicates how hard that chart is to play.

Separately, there is a mechanism called the groove radar, which shows the character of each chart. Its elements are as follows.

STREAM
Average density. The more notes the song contains overall, the higher the value.
VOLTAGE
Peak density. The more notes packed into the densest 4 beats, the higher the value.
AIR
Jump frequency. The more notes that must be stepped on simultaneously or must not be stepped on, the higher the value.
FREEZE
Holds. The more beats you have to keep holding down some panel, the higher the value.
CHAOS
Irregularity. The more off-beat notes and rhythm shifts, the higher the value.

The groove radar values can be computed exactly from the chart itself. The formula has not been published officially, but volunteer players have reverse-engineered it with considerable accuracy.

The difficulty level, on the other hand, is assigned manually by the developers. As a result, difficulty levels are sometimes revised when the game is updated to a new version.

So, can the difficulty level be estimated from the groove radar values? Let's find out.

[^DDR]: The name is long and I would like to abbreviate it, but I feel an abbreviation on Qiita would get in the way of people who want to look this up later, so I will spell it out in full.
[^level]: Because levels were once indicated with foot icons, they are written like "foot 16".

Environment

  • Python3 + JupyterLab
    • Matplotlib
    • NumPy
    • Pandas
    • PyCM
    • SciPy
    • Seaborn

Preparation

Imports and helper functions.

import math
from pathlib import Path

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns
from pycm import ConfusionMatrix
from scipy.optimize import minimize, differential_evolution, Bounds

def ustd_coefficient(n):
    # Correction factor that converts the population standard deviation (ddof=0)
    # into an unbiased estimate of the standard deviation.
    try:
        return math.sqrt(n / 2) * math.gamma((n - 1) / 2) / math.gamma(n / 2)
    except OverflowError:
        # For large n the gamma values overflow, so fall back to an asymptotic approximation.
        return math.sqrt(n / (n - 1.5))

def std_u(a):
    # Bias-corrected sample standard deviation.
    return np.std(a) * ustd_coefficient(len(a))

oo = math.inf
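
As a quick check of my own (not part of the original article), here is what std_u does compared with NumPy's population standard deviation on a small sample:

# My own sanity check: for a small sample the bias-corrected value is
# noticeably larger than the population standard deviation from np.std.
sample = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
print(np.std(sample))   # ≈ 1.414 (ddof=0)
print(std_u(sample))    # ≈ 1.682 (bias-corrected)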

Read the data for each chart. As the data, I took the old-song and new-song lists for DanceDanceRevolution A20 from the BEMANI wiki at the time and saved them as CSV files; they are placed here. This time, the old songs are used as training data for the fitting and the new songs as evaluation data. Now, let's read them in and turn them into DataFrames.

old_csv = Path('./old.csv')
new_csv = Path('./new.csv')
train_df = pd.read_csv(old_csv)
test_df = pd.read_csv(new_csv)
display(train_df)
display(test_df)
|      | VERSION | MUSIC | SEQUENCE | LEVEL | STREAM | VOLTAGE | AIR | FREEZE | CHAOS |
|------|---------|-------|----------|-------|--------|---------|-----|--------|-------|
| 0    | DanceDanceRevolution A | Words of love | BEGINNER | 3 | 21 | 22 | 7 | 26 | 0 |
| 1    | DanceDanceRevolution A | Words of love | BASIC | 5 | 34 | 22 | 18 | 26 | 0 |
| 2    | DanceDanceRevolution A | Words of love | DIFFICULT | 7 | 43 | 34 | 23 | 26 | 7 |
| 3    | DanceDanceRevolution A | Words of love | EXPERT | 11 | 63 | 45 | 21 | 25 | 28 |
| 4    | DanceDanceRevolution A | Amanojaku | BEGINNER | 3 | 20 | 25 | 0 | 0 | 0 |
| ...  | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 3390 | DanceDanceRevolution 1st | PARANOiA | EXPERT | 11 | 67 | 52 | 25 | 0 | 17 |
| 3391 | DanceDanceRevolution 1st | TRIP MACHINE | BEGINNER | 3 | 25 | 26 | 5 | 0 | 0 |
| 3392 | DanceDanceRevolution 1st | TRIP MACHINE | BASIC | 8 | 47 | 40 | 14 | 0 | 4 |
| 3393 | DanceDanceRevolution 1st | TRIP MACHINE | DIFFICULT | 9 | 52 | 40 | 30 | 0 | 7 |
| 3394 | DanceDanceRevolution 1st | TRIP MACHINE | EXPERT | 10 | 56 | 53 | 36 | 0 | 12 |

3395 rows × 9 columns
|     | VERSION | MUSIC | SEQUENCE | LEVEL | STREAM | VOLTAGE | AIR | FREEZE | CHAOS |
|-----|---------|-------|----------|-------|--------|---------|-----|--------|-------|
| 0   | DanceDanceRevolution A20 | Okay! Lovely! Sweetie! Darling! | BEGINNER | 3 | 18 | 21 | 5 | 16 | 0 |
| 1   | DanceDanceRevolution A20 | Okay! Lovely! Sweetie! Darling! | BASIC | 7 | 37 | 28 | 18 | 39 | 0 |
| 2   | DanceDanceRevolution A20 | Okay! Lovely! Sweetie! Darling! | DIFFICULT | 12 | 60 | 56 | 54 | 55 | 21 |
| 3   | DanceDanceRevolution A20 | Okay! Lovely! Sweetie! Darling! | EXPERT | 15 | 95 | 99 | 30 | 25 | 100 |
| 4   | DanceDanceRevolution A20 | Revolution Passionate | BEGINNER | 3 | 16 | 16 | 1 | 35 | 0 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 380 | DanceDanceRevolution A20 | 50th Memorial Songs -The BEMANI History- | EXPERT | 13 | 63 | 79 | 14 | 62 | 63 |
| 381 | DanceDanceRevolution A20 | 50th Memorial Songs -When we are two ~ under the cherry bl... | BEGINNER | 3 | 17 | 20 | 3 | 46 | 0 |
| 382 | DanceDanceRevolution A20 | 50th Memorial Songs -When we are two ~ under the cherry bl... | BASIC | 7 | 40 | 33 | 36 | 29 | 0 |
| 383 | DanceDanceRevolution A20 | 50th Memorial Songs -When we are two ~ under the cherry bl... | DIFFICULT | 9 | 50 | 46 | 47 | 3 | 6 |
| 384 | DanceDanceRevolution A20 | 50th Memorial Songs -When we are two ~ under the cherry bl... | EXPERT | 12 | 73 | 60 | 60 | 15 | 32 |

385 rows × 9 columns

Furthermore, standardize each groove radar value: transform the training data so that its mean is 0 and its standard deviation is 1, and apply the same transformation (with the training statistics) to the evaluation data.

grs = ['STREAM', 'VOLTAGE', 'AIR', 'FREEZE', 'CHAOS']
sgrs = ['S_{}'.format(gr) for gr in grs]

m = {}
s = {}
for gr, sgr in zip(grs, sgrs):
    v = train_df.loc[:, gr].values
    v_t = test_df.loc[:, gr].values
    m[gr] = np.mean(v)
    s[gr] = std_u(v)
    train_df[sgr] = (v - m[gr]) / s[gr]
    test_df[sgr] = (v_t - m[gr]) / s[gr]

display(train_df.loc[:, sgrs])
display(test_df.loc[:, sgrs])
|      | S_STREAM | S_VOLTAGE | S_AIR | S_FREEZE | S_CHAOS |
|------|----------|-----------|-------|----------|---------|
| 0    | -0.981448 | -0.838977 | -0.636332 | 0.056063 | -0.661167 |
| 1    | -0.534364 | -0.838977 | -0.160513 | 0.056063 | -0.661167 |
| 2    | -0.224844 | -0.405051 | 0.055768 | 0.056063 | -0.441192 |
| 3    | 0.462978 | -0.007285 | -0.030744 | 0.014296 | 0.218735 |
| 4    | -1.015839 | -0.730495 | -0.939125 | -1.029883 | -0.661167 |
| ...  | ... | ... | ... | ... | ... |
| 3390 | 0.600542 | 0.245838 | 0.142280 | -1.029883 | -0.126941 |
| 3391 | -0.843883 | -0.694335 | -0.722844 | -1.029883 | -0.661167 |
| 3392 | -0.087279 | -0.188088 | -0.333538 | -1.029883 | -0.535467 |
| 3393 | 0.084676 | -0.188088 | 0.358562 | -1.029883 | -0.441192 |
| 3394 | 0.222240 | 0.281999 | 0.618099 | -1.029883 | -0.284066 |

3395 rows × 5 columns
|     | S_STREAM | S_VOLTAGE | S_AIR | S_FREEZE | S_CHAOS |
|-----|----------|-----------|-------|----------|---------|
| 0   | -1.08462 | -0.87514 | -0.72284 | -0.36161 | -0.66117 |
| 1   | -0.43119 | -0.62201 | -0.16051 | 0.599036 | -0.66117 |
| 2   | 0.359805 | 0.39048 | 1.396711 | 1.26731 | -0.00124 |
| 3   | 1.563493 | 1.945381 | 0.358562 | 0.014296 | 2.481343 |
| 4   | -1.1534 | -1.05594 | -0.89587 | 0.431967 | -0.66117 |
| ... | ... | ... | ... | ... | ... |
| 380 | 0.462978 | 1.222171 | -0.33354 | 1.55968 | 1.318614 |
| 381 | -1.11901 | -0.9113 | -0.80936 | 0.891406 | -0.66117 |
| 382 | -0.32802 | -0.44121 | 0.618099 | 0.181364 | -0.66117 |
| 383 | 0.015894 | 0.028875 | 1.093917 | -0.90458 | -0.47262 |
| 384 | 0.806889 | 0.535122 | 1.656248 | -0.40338 | 0.344436 |

385 rows × 5 columns

Next, extract a rank-2 tensor (2-D array) holding each chart's groove radar values and a rank-1 tensor (1-D array) holding each chart's difficulty level.

train_sgr_arr = train_df.loc[:, sgrs].values
test_sgr_arr = test_df.loc[:, sgrs].values
train_level_arr = train_df.loc[:, 'LEVEL'].values
test_level_arr = test_df.loc[:, 'LEVEL'].values

Multiple regression analysis by the least squares method

Multiple regression analysis is based on the following concept.

There is a group of explanatory variables $x_n$ and an objective variable $y$. Here the explanatory variables are the groove radar values and the objective variable is the difficulty level. We consider $y' = k_0 + k_1x_1 + k_2x_2 + \cdots + k_nx_n$ and search for the coefficients $k_n$ that minimize the squared error over the $m$ data points, $e^2 := \sum_{i=1}^m \left(y'_i - y_i\right)^2$. This time, we will solve this optimization problem with the differential evolution method using SciPy.

First, define the function we want to minimize, namely $e^2$.

def hadprosum(a, b):
    # Row-wise dot product: sum of the Hadamard (element-wise) product along axis 1.
    return (a * b).sum(axis=1)

def estimate(x, sgr_arr):
    # Linear model: constant term x[0] plus a weighted sum of the groove radar values.
    x_const = x[0]
    x_coef = x[1:]
    return hadprosum(sgr_arr, x_coef) + x_const

def sqerr(x):
    # Sum of squared errors against the training difficulty levels.
    est = estimate(x, train_sgr_arr)
    return ((est - train_level_arr) ** 2).sum()

Pass this to SciPy's differential_evolution function. For the search range, I settled on bounds that seemed wide enough after some trial and error.

bounds = Bounds([0.] * 6, [10.] * 6)
result = differential_evolution(sqerr, bounds, seed=300)
print(result)
     fun: 5170.056057917698
     jac: array([-0.00236469,  0.14933903,  0.15834303,  0.07094059,  0.01737135,
        0.1551598 ])
 message: 'Optimization terminated successfully.'
    nfev: 3546
     nit: 37
 success: True
       x: array([8.04447683, 2.64586828, 0.58686288, 0.42785461, 0.45934494,
       0.4635763 ])

Looking at these coefficients, STREAM has by far the largest influence, followed by VOLTAGE, CHAOS, FREEZE, and AIR.
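
As a side check of my own (not part of the original procedure): since this is ordinary least squares and the bounded search converged well inside its box, essentially the same coefficients should also be obtainable in closed form with np.linalg.lstsq.

# My own sanity check: solve the same least-squares problem in closed form.
# Assumes train_sgr_arr and train_level_arr as defined above.
X = np.hstack([np.ones((len(train_sgr_arr), 1)), train_sgr_arr])
coef, res, rank, sv = np.linalg.lstsq(X, train_level_arr, rcond=None)
print(coef)  # coef[0]: constant term, coef[1:]: weights for STREAM, VOLTAGE, AIR, FREEZE, CHAOS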

Now, let's evaluate using the parameters actually obtained.

First, define a prediction function. Given the parameters and a tensor of groove radar values, it returns a tensor of predicted difficulty levels.

def pred1(x, sgr_arr):
    # Clip to the valid level range (1-19), then round to the nearest integer level.
    est = estimate(x, sgr_arr).clip(1., 19.)
    return np.round(est).astype(int)

Create a confusion matrix object by passing the actual difficulty levels and the return value of this function to PyCM's ConfusionMatrix, then read its properties to get the accuracy and the macro F1 score.

train_pred1_arr = pred1(result.x, train_sgr_arr)
test_pred1_arr = pred1(result.x, test_sgr_arr)

train_cm1 = ConfusionMatrix(train_level_arr, train_pred1_arr)
test_cm1 = ConfusionMatrix(test_level_arr, test_pred1_arr)

print('====================')
print('Train Score')
print('  Accuracy: {}'.format(train_cm1.Overall_ACC))
print('  Fmeasure: {}'.format(train_cm1.F1_Macro))
print('====================')
print('Test Score')
print('  Accuracy: {}'.format(test_cm1.Overall_ACC))
print('  Fmeasure: {}'.format(test_cm1.F1_Macro))
print('====================')
====================
Train Score
  Accuracy: 0.33431516936671574
  Fmeasure: 0.2785969345790368
====================
Test Score
  Accuracy: 0.3142857142857143
  Fmeasure: 0.24415916194348555
====================

The accuracy is 31.4%, much lower than I expected. Let's draw the confusion matrix as a heatmap with Seaborn.

plt.figure(figsize=(10, 10), dpi=200)
sns.heatmap(pd.DataFrame(test_cm1.table), annot=True, square=True, cmap='Blues')
plt.show()

(Figure: 01.png, confusion matrix heatmap for the least-squares model)

The horizontal axis is the actual difficulty and the vertical axis is the predicted difficulty. Low-difficulty charts tend to be predicted too high, mid-difficulty charts too low, and the highest-difficulty charts too high again; rather than following a straight line, the heatmap bows inward.

(Ordered) logistic regression analysis by maximum likelihood estimation

Next, let's try logistic regression analysis. Logistic regression is suited to analyses where the objective variable is some kind of probability. Consider the following formula: $y' = \frac{1}{1 + \exp\left[-\left(k_0 + k_1x_1 + k_2x_2 + \cdots + k_nx_n\right)\right]}$ This $y'$ takes values between 0 and 1. When the objective variable $y$ of each of the $m$ data points is 0 or 1, we consider maximizing the likelihood $l = \prod_{i=1}^m \left[y_iy_i' + (1 - y_i)(1 - y_i')\right]$. Written as a formula it looks opaque, but $y'$ is simply the probability of the positive class: when $y$ is 1 (positive) we use $y'$ as is, and when $y$ is 0 (negative) we use the probability of the negative class, that is, $1 - y'$. Multiplying many values of $y'$, each in the interval $(0, 1)$, produces numbers too small to handle on a computer, so it is usual to maximize the log-likelihood $\log l = \sum_{i=1}^m \log\left[y_iy_i' + (1 - y_i)(1 - y_i')\right]$ instead.
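
For concreteness, here is a tiny made-up example of my own (not taken from the data) showing how that likelihood term picks out $y'$ or $1 - y'$:

# Made-up example of the binary log-likelihood.
y = np.array([1, 0, 1])             # observed labels
y_pred = np.array([0.8, 0.3, 0.6])  # predicted probabilities of the positive class
loglh = np.log(y * y_pred + (1 - y) * (1 - y_pred)).sum()
print(loglh)  # log(0.8) + log(0.7) + log(0.6) ≈ -1.091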

This time, though, we are not dealing with negative vs. positive but with which class of an ordered set each chart belongs to. In that case, we assume several logistic curves that differ only in the constant term $k_0$ and interpret them as "the probability of difficulty 2 or higher", "the probability of difficulty 3 or higher", and so on. "The probability of difficulty exactly 2" is then "the probability of difficulty 2 or higher" minus "the probability of difficulty 3 or higher", so the likelihood can be computed from these, as the small example below illustrates.
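
A toy illustration of this cumulative-probability idea (the numbers are made up, not taken from the fit):

# Made-up cumulative probabilities P(level >= 1), ..., P(level >= 5) for a 5-level toy case.
upper = np.array([1.0, 0.9, 0.6, 0.2, 0.05])
# P(level == k) = P(level >= k) - P(level >= k + 1), with P(level >= 6) taken as 0.
per_level = upper - np.append(upper[1:], 0.0)
print(per_level)  # [0.1, 0.3, 0.4, 0.15, 0.05], which sums to 1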

First, convert the difficulty levels into a one-hot rank-2 tensor for the calculation.

train_level_onehot_arr = np.zeros(train_level_arr.shape + (19,))  # 19 possible levels (1-19)
for i, l in np.ndenumerate(train_level_arr):
    train_level_onehot_arr[i, l - 1] = 1.

Next, define the function to minimize. Since this is a minimization, we use the log-likelihood above with its sign flipped.

def upperscore(x, sgr_arr):
    # Insert +inf / -inf at both ends so that P(level >= 1) = 1 and P(level >= 20) = 0.
    x_const = np.append(np.append(oo, x[:18].copy()), -oo)
    x_coef = x[18:]
    var = np.asarray([hadprosum(sgr_arr, x_coef)]).T
    cons = np.asarray([x_const])
    return 1 / (1 + np.exp(-(var + cons)))

def score(x, sgr_arr):
    # Shift and subtract, then drop the last column, to get the probability of each difficulty level.
    us = upperscore(x, sgr_arr)
    us_2 = np.roll(us, -1)
    return np.delete(us - us_2, -1, axis=1)

def mloglh(x):
    # Negative log-likelihood (clipped to avoid taking log of zero).
    sc = score(x, train_sgr_arr)
    ret = -(np.log((sc * train_level_onehot_arr).sum(axis=1).clip(1e-323, oo)).sum())
    return ret

Run the search. Note that it takes considerably longer than before.

bounds = Bounds([-60.] * 18 + [0] * 5, [20.] * 18 + [10] * 5)
result = differential_evolution(mloglh, bounds, seed=300)
print(result)
     fun: 4116.792196474322
     jac: array([ 0.00272848,  0.00636646, -0.00090949,  0.00327418, -0.00563887,
       -0.00291038, -0.00509317,  0.00045475,  0.00800355,  0.00536602,
       -0.00673026,  0.00536602,  0.00782165, -0.01209628,  0.00154614,
       -0.0003638 ,  0.00218279,  0.00582077,  0.04783942,  0.03237801,
        0.01400622,  0.00682121,  0.03601599])
 message: 'Optimization terminated successfully.'
    nfev: 218922
     nit: 625
 success: True
       x: array([ 14.33053717,  12.20158703,   9.97549255,   8.1718939 ,
         6.36190483,   4.58724228,   2.61478521,   0.66474105,
        -1.46625252,  -3.60065138,  -6.27127806,  -9.65032254,
       -14.06390123, -18.287351  , -23.44011235, -28.39033479,
       -32.35825176, -43.38390248,   6.13059504,   2.01974223,
         0.64631137,   0.67555403,   2.44873606])

This time the order is STREAM > CHAOS > VOLTAGE > FREEZE > AIR, and you can see that CHAOS has a much larger influence than before.

Let's actually evaluate this as well.

def pred2(x, sgr_arr):
    # Predict the level with the highest probability (+1 because levels are 1-indexed).
    sc = score(x, sgr_arr)
    return np.argmax(sc, axis=1) + 1

train_pred2_arr = pred2(result.x, train_sgr_arr)
test_pred2_arr = pred2(result.x, test_sgr_arr)

train_cm2 = ConfusionMatrix(train_level_arr, train_pred2_arr)
test_cm2 = ConfusionMatrix(test_level_arr, test_pred2_arr)

print('====================')
print('Train Score')
print('  Accuracy: {}'.format(train_cm2.Overall_ACC))
print('  Fmeasure: {}'.format(train_cm2.F1_Macro))
print('====================')
print('Test Score')
print('  Accuracy: {}'.format(test_cm2.Overall_ACC))
print('  Fmeasure: {}'.format(test_cm2.F1_Macro))
print('====================')
====================
Train Score
  Accuracy: 0.4960235640648012
  Fmeasure: 0.48246495009640167
====================
Test Score
  Accuracy: 0.5454545454545454
  Fmeasure: 0.5121482282311358
====================

This time the accuracy is 54.5%. That is a big improvement over the previous method, but still far from good enough.

plt.figure(figsize=(10, 10), dpi=200)
sns.heatmap(pd.DataFrame(test_cm2.table), annot=True, square=True, cmap='Blues')
plt.show()

(Figure: 02.png, confusion matrix heatmap for the ordered logistic model)

The predictions now lie roughly along a straight line, but the lowest and highest difficulty levels are still handled poorly.

Conclusion

The bottom line is, "you can get a rough value, but it is not practical." I made this attempt hoping it could serve as a reference when assigning difficulty levels to StepMania charts, but in the end it seems I will still have to play them and adjust by hand.

One more thing: in the logistic regression the constant terms should naturally satisfy a monotonic ordering, but this code does not enforce that constraint, so depending on the random numbers the search can pass the convergence test at an abnormal solution[^bound]. SciPy's optimization functions can accept inequality constraints, but when I tried I got an error along the lines of "a constraint of unknown form was passed" and could not make it work. It also wastes search time, so if anyone knows how to solve this, I would appreciate the guidance. One possible workaround is sketched below.
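
A possible workaround of my own (a sketch under assumptions, not something used in this article): reparameterize the 18 constant terms as a starting value plus cumulative non-negative gaps, so the ordering holds by construction and no inequality constraints are needed. The function and bounds below are hypothetical.

# Sketch (my own idea, not used above): build monotonically decreasing constant terms
# from a start value and non-negative gaps, so no inequality constraint is required.
def thresholds_from_params(p):
    start = p[0]     # constant term for "level >= 2"
    gaps = p[1:18]   # 17 non-negative decrements (their bounds enforce >= 0)
    return start - np.concatenate(([0.0], np.cumsum(gaps)))

# The objective would then call thresholds_from_params(x[:18]) instead of using x[:18]
# directly, with bounds such as:
# bounds = Bounds([-60.] + [0.] * 17 + [0.] * 5, [20.] + [60.] * 17 + [10.] * 5)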

[^bound]: I actually ran into this phenomenon once while tuning the search ranges.
