[PYTHON] Data analysis practice (data acquisition / cleaning / inspection) -Dereste event 2001st-place border-
Overview
- I want to practice time series analysis, so I will try it on the 2001st-place border data from Dereste's event point rankings.
- Ideally, I would like to predict the 2001st-place border at the moment an event starts. (I suspect this will be hard.)
- As preparation, this page covers data acquisition, cleaning, and a summary check.
- There are not enough features yet, so I would like to add more and dig deeper later.
- The scripts created for this page are a1 to a4, found here.
Data used
Data from the [Imas Dereste Strategy Summary Wiki [Idolmaster Cinderella Girls Starlight Stage]](https://imascg-slstage-wiki.gamerch.com/%E3%82%A4%E3%83%99%E3%83%B3%E3%83%88%E3%83%87%E3%83%BC%E3%82%BF)
Data summary
| Name | Meaning | Measurement scale | Data type |
|---|---|---|---|
| event name | Event name | Nominal scale | String |
| 2001 border pt | Points at the 2001st-place border (the value I want to predict) | Ratio scale | Numeric (integer) |
| format | Event format (Attapon, Groove, Carnival) | Nominal scale | String (category) |
| attribute | Event attribute (Groove format only) | Nominal scale | String (category) |
| date | Event start date | Interval scale | Date |
| period | Event length (hours) | Ratio scale | Numeric (integer) |
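The column types in the table above could be set up in pandas roughly as follows. This is only a sketch: the column names (`border_2001`, `period`, and so on) and the three rows are hypothetical placeholders, not values from the wiki, and the scripts a1 to a4 may do this differently.

```python
import pandas as pd

# Hypothetical rows for illustration only; these are NOT the wiki's values.
raw = pd.DataFrame(
    {
        "event": ["Event A", "Event B", "Event C"],
        "border_2001": [50000, 60000, 180000],
        "format": ["Attapon", "Groove", "Carnival"],
        "attribute": [None, "Vo", None],          # only Groove has an attribute
        "date": ["2015/12/04", "2016/01/20", "2016/02/10"],
        "period": [174, 198, None],               # missing for Carnival
    }
)

events = raw.assign(
    # Nominal variables become pandas categoricals.
    format=raw["format"].astype("category"),
    attribute=raw["attribute"].astype("category"),
    # The start date becomes a real datetime column.
    date=pd.to_datetime(raw["date"], format="%Y/%m/%d"),
    # Nullable integer keeps "hours" as integers despite missing values.
    period=raw["period"].astype("Int64"),
)

print(events.dtypes)
```

Using the nullable `Int64` dtype is one way to keep the period as an integer even though some events have no value; a plain float column with NaN would also work.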
- Number of records: 95
- Missing values
  - Attribute: 63 missing. Missing for every format except Groove
  - Period: 3 missing. Missing when the format is Carnival
- Statistics
  - Event name: no duplicates
  - 2001 border: no outliers. Mean is higher than median
  - Format: 3 categories. The most frequent appears 60 times
  - Date: no duplicates
  - Attribute: 3 categories. The most frequent appears 11 times
  - Period: no outliers. Mean is higher than median
| | Event | 2001 | format | date | attribute | period |
|---|---|---|---|---|---|---|
| count | 95 | 95 | 95 | 95 | 32 | 92 |
| unique | 95 | NaN | 3 | 95 | 3 | NaN |
| top | Passion fan fanfare | NaN | Attapon | 2015/12/4 | Vo | NaN |
| freq | 1 | NaN | 60 | 1 | 11 | NaN |
| mean | NaN | 95234.4 | NaN | NaN | NaN | 183.5 |
| std | NaN | 42973.7 | NaN | NaN | NaN | 19.1 |
| min | NaN | 40096 | NaN | NaN | NaN | 150 |
| 25% | NaN | 63761.5 | NaN | NaN | NaN | 174 |
| 50% | NaN | 83532 | NaN | NaN | NaN | 174 |
| 75% | NaN | 115178.5 | NaN | NaN | NaN | 198 |
| max | NaN | 224697 | NaN | NaN | NaN | 249 |
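A summary with these row labels (count, unique, top, freq, mean, ...) is what pandas' `describe(include="all")` produces, and the missing counts can be read from `isna().sum()`. The sketch below assumes that approach; the frame and its column names are hypothetical toy data, not the wiki values.

```python
import numpy as np
import pandas as pd

# Hypothetical toy rows, not the wiki data.
events = pd.DataFrame(
    {
        "format": ["Attapon", "Attapon", "Groove", "Carnival"],
        "attribute": [np.nan, np.nan, "Vo", np.nan],
        "border_2001": [50000, 70000, 60000, 180000],
        "period": [174.0, 198.0, 174.0, np.nan],
    }
)

# Number of missing values per column.
missing = events.isna().sum()

# Combined summary: count/unique/top/freq for text columns,
# mean/std/quartiles for numeric columns, NaN where a statistic does not apply.
summary = events.describe(include="all")

print(missing)
print(summary)
```

In this toy frame `missing["attribute"]` is 3 and `missing["period"]` is 1, mirroring the pattern in the real data where the attribute exists only for Groove and the period is missing for Carnival.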
Data distribution
format
- Attapon accounts for about two-thirds of all events (60 of 95)
- Carnival has been held only 3 times
| format | Number of events |
|---|---|
| Carnival | 3 |
| Groove | 32 |
| Attapon | 60 |
attribute
- Only Groove-format events have an attribute
- The three attributes have been held almost equally often
| attribute | Number of events |
|---|---|
| Da | 10 |
| Vi | 11 |
| Vo | 11 |
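Frequency tables like the two above can be obtained with `Series.value_counts`. The following is a minimal sketch on hypothetical toy rows (the `events` frame and its values are made up); with the real data, the Attapon share would be 60/95, roughly 63%.

```python
import pandas as pd

# Hypothetical toy rows, not the wiki data.
events = pd.DataFrame(
    {
        "format": ["Attapon", "Attapon", "Groove", "Groove", "Carnival"],
        "attribute": [None, None, "Vo", "Da", None],
    }
)

# Absolute counts per format, and the same as a share of all events.
format_counts = events["format"].value_counts()
format_share = events["format"].value_counts(normalize=True)

# value_counts skips missing values by default, so count them separately.
attribute_counts = events["attribute"].value_counts()
attribute_missing = events["attribute"].isna().sum()

print(format_counts)
print(format_share.round(2))
print(attribute_counts, attribute_missing)
```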
2001 border
- All data
  - The distribution has a long right tail
[Figure: distribution of the 2001st-place border, all data]
- Color-coded by format
  - Carnival events are rare, but their borders are higher
  - Attapon has a long right tail
  - Groove has two peaks
[Figure: distribution of the 2001st-place border, color-coded by format]
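The "long right tail" read off the plots can also be checked numerically: a right-skewed sample tends to have a mean above its median and a positive sample skewness. The sketch below uses `Series.skew()` on made-up numbers (in thousands of points); it is one possible check, not necessarily what the original scripts do.

```python
import pandas as pd

# Hypothetical border values (in thousands of points), chosen to be right-skewed.
borders = pd.Series([40, 45, 50, 55, 60, 70, 90, 150, 220], name="border_2001")

mean = borders.mean()
median = borders.median()
skew = borders.skew()  # sample skewness; positive means a longer right tail

print(f"mean={mean:.1f}, median={median:.1f}, skew={skew:.2f}")
```

The same call works per format through `groupby("format")["border_2001"].skew()`, which would help compare Attapon and Groove without relying on the histogram alone.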
period
- All data
  - Carnival events are not included because their period is missing
[Figure: distribution of period, all data]
Relationship between variables
By format
2001 border
- Carnival: omitted because there are only 3 events
- Groove: lowest mean, minimum, and maximum of the three formats (its median is slightly above Attapon's). Median < mean.
- Attapon: in the middle of the three formats. Median < mean. There are outliers at the high end.
| format | Carnival | Groove | Attapon |
|---|---|---|---|
| count | 3 | 32 | 60 |
| mean | 188751.7 | 87048.6 | 94924.4 |
| std | 19012.5 | 35318.9 | 42349.2 |
| min | 176743 | 40096 | 42944 |
| 25% | 177791.5 | 52942.5 | 67515.8 |
| 50% | 178840 | 84560 | 80589.5 |
| 75% | 194756 | 114458 | 112983 |
| max | 210672 | 170014 | 224697 |

[Figure: 2001st-place border by format]
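A per-format summary of this shape can be produced with `groupby(...).describe()`. The sketch below runs on hypothetical toy rows and adds one extra column flagging where the mean exceeds the median, which is the "Median < mean" observation made above; note that pandas returns formats as rows, so `.T` gives the layout used in the table.

```python
import pandas as pd

# Hypothetical toy rows, not the wiki data.
events = pd.DataFrame(
    {
        "format": ["Attapon", "Attapon", "Attapon", "Groove", "Groove", "Carnival"],
        "border_2001": [50000, 80000, 200000, 45000, 90000, 180000],
    }
)

# One row per format with count, mean, std, min, quartiles, max.
by_format = events.groupby("format")["border_2001"].describe()

# Flag formats whose mean is above the median (a hint of right skew).
by_format["mean_above_median"] = by_format["mean"] > by_format["50%"]

print(by_format.T)
```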
period
- Carnival: no data
- Groove: at least half of the events last 174 hours (minimum, 25%, and median are all 174)
- Attapon: the minimum is smaller and the maximum is larger than for Groove. There are outliers.
| format | Carnival | Groove | Attapon |
|---|---|---|---|
| count | 0 | 32 | 60 |
| mean | NaN | 188.3 | 181.0 |
| std | NaN | 18.1 | 19.3 |
| min | NaN | 174 | 150 |
| 25% | NaN | 174 | 174 |
| 50% | NaN | 174 | 174 |
| 75% | NaN | 198 | 198 |
| max | NaN | 222 | 249 |

[Figure: period by format]
By attribute
- Only Groove-format events have an attribute
- 2001 border
  - Da: mean, median, minimum, and maximum are all the lowest
  - Vi: mean, minimum, and maximum are the highest
  - Vo: highest median
| attribute | Da | Vi | Vo |
|---|---|---|---|
| count | 10 | 11 | 11 |
| mean | 81120.2 | 92813.3 | 86673.3 |
| std | 32024.0 | 38595.9 | 37182.3 |
| min | 40096 | 46300 | 42544 |
| 25% | 54767.3 | 64639.5 | 49871 |
| 50% | 78106.5 | 82143 | 100476 |
| 75% | 110899.3 | 110700 | 118864.5 |
| max | 127482 | 170014 | 140000 |

[Figure: 2001st-place border by attribute]
period
- Correlation matrix: the border and the period appear to be nearly uncorrelated (r = -0.126)
- I don't think the relationship with the border is zero, since a longer period gives more time to earn points, but other factors seem to be stronger.
| | 2001 | period |
|---|---|---|
| 2001 | 1.0 | -0.126 |
| period | -0.126 | 1.0 |

[Figure: 2001st-place border vs. period]
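A correlation matrix like this one can be computed with `DataFrame.corr()`, which defaults to the Pearson coefficient and drops rows with a missing value pair by pair. The sketch below uses hypothetical toy rows. As a side note on reading the real figure: r = -0.126 corresponds to r² of about 0.016, so a straight-line fit on period alone would account for under 2% of the variance in the border.

```python
import numpy as np
import pandas as pd

# Hypothetical toy rows, not the wiki data; the last period is missing.
events = pd.DataFrame(
    {
        "border_2001": [50000, 80000, 60000, 120000, 180000],
        "period": [174, 198, 174, 150, np.nan],
    }
)

# Pearson correlation; the row with a missing period is ignored for the pair.
corr = events[["border_2001", "period"]].corr()
r = corr.loc["border_2001", "period"]

print(corr.round(3))
print(f"r = {r:.3f}, r^2 = {r ** 2:.3f}")
```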
date
2001 border
[Figures: 2001st-place border by event start date (two plots)]
period
- Some early events exceeded 200 hours, but recent events have settled between 160 and 200 hours
[Figure: period by event start date]
Summary
- Format
  - To dig deeper, it seems best to start with Attapon, which has the most data.
  - Carnival has too few events, so analysis is difficult.
- Period
  - I expected that a longer period would mean a higher border, but that was not the case.
  - I don't think the relationship is zero, but other factors may be stronger.
- Date
  - The border appears to be rising gradually over time.
  - The factors that come to mind are as follows:
    - The number of players has increased and competition has become fiercer
    - Points have become easier to earn
    - The number of characters has increased (stat inflation, more players able to play unattended, and so on)
    - Grand Live has improved time efficiency
- Other possible factors
  - Variation from event to event
  - Popularity of the event's ranking-reward idols
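The "gradually rising" impression in the Date item above could be quantified with a simple straight-line fit of the border against the event start date. The sketch below uses `numpy.polyfit` on hypothetical toy rows; it is only a first check and says nothing about which of the listed factors drives the trend.

```python
import numpy as np
import pandas as pd

# Hypothetical toy rows, not the wiki data.
events = pd.DataFrame(
    {
        "date": pd.to_datetime(
            ["2016-01-01", "2016-07-01", "2017-01-01", "2017-07-01", "2018-01-01"]
        ),
        "border_2001": [50000, 62000, 58000, 75000, 90000],
    }
)

# Days elapsed since the first event, used as the x-axis of the fit.
days = (events["date"] - events["date"].min()).dt.days.to_numpy()

# Degree-1 least-squares fit: border ~ slope * days + intercept.
slope, intercept = np.polyfit(days, events["border_2001"].to_numpy(), deg=1)
slope_per_year = slope * 365.25

print(f"estimated change: {slope_per_year:.0f} pt per year")
```

Fitting each format separately (for example, Attapon only) would avoid mixing the format effect into the time trend.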