[PYTHON] Estimating the 1st-place points in Deresute Atapon-format events: a state-space model using PyStan
Overview
- I learned from a book how to analyze state-space models with Stan, so I practiced on real data.
- The goal is to estimate, at the start of an event, the 1st-place points at the end of that event.
- This time I used a "local linear trend" + "time-varying coefficient" model.
- **As a result, the model appears to be sufficiently accurate.**
- The 1st-place points can be almost entirely explained by the "trend" and the "length of the event period".
- As expected, this was easier than predicting 2001st place.
- The "excitement of each event" does not seem to affect the 1st-place points much; the 1st-place player appears to run for the full period at maximum efficiency.
- The scripts are stored [here](https://github.com/kzy611/qiita/tree/master/dereste_analysis/dereste_event_point_ranking_border/%E3%82%A2%E3%82%BF%E3%83%9D%E3%83%B3_1/script).
- Predicting 2001st place is difficult and I am still considering explanatory variables for it, so I tried this as a change of pace.
Data to use
Only the Atapon-format events were extracted from the data acquired here.
- 1st-place points (objective variable)
[Figure: 1st-place points]
- Length of event period (h) (explanatory variable)
[Figure: length of event period (h)]
"Local linear trend" + "time-varying coefficient" model
- [See here for the formulas](https://qiita.com/kazuya_minakuchi/items/09b010927688b322df9d#%E3%83%AD%E3%83%BC%E3%82%AB%E3%83%AB%E7%B7%9A%E5%BD%A2%E3%83%88%E3%83%AC%E3%83%B3%E3%83%89%E3%83%A2%E3%83%87%E3%83%AB)
- In the plots, the red line is the point estimate and the red band is the interval estimate (5%-95% interval).
- For the data used to build the model, the interval estimate contains the actual value 91.7% of the time (total: 60, inside the interval: 55).
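The coverage figure above can be reproduced with a simple count. The following is a minimal sketch, not the author's script: the function name `interval_coverage` and the toy numbers are assumptions made for illustration, and only the final `55 / 60` uses counts reported in the article.

```python
def interval_coverage(actuals, lowers, uppers):
    """Return (hits, total, rate): how often each actual value lies in [lower, upper]."""
    if not (len(actuals) == len(lowers) == len(uppers)):
        raise ValueError("all inputs must have the same length")
    hits = sum(lo <= y <= hi for y, lo, hi in zip(actuals, lowers, uppers))
    total = len(actuals)
    return hits, total, hits / total


# Toy, made-up numbers (NOT the article's data): 3 of the 4 values fall inside the band.
actuals = [100.0, 120.0, 95.0, 140.0]
lowers = [90.0, 105.0, 97.0, 120.0]
uppers = [110.0, 125.0, 115.0, 150.0]
hits, total, rate = interval_coverage(actuals, lowers, uppers)
print(hits, total, rate)

# The counts reported in the article: 55 of 60 events inside the interval.
reported_rate = 55 / 60
print(f"{reported_rate:.1%}")
```

Since the band is a 5%-95% interval, a well-calibrated model would be expected to land near 90% on this measure, so 91.7% is in the plausible range.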
Trend component (A)
[Figure: trend component (A)]
Component by "length of event period" (B)
[Figure: component by "length of event period" (B)]
- Time-varying coefficient
[Figure: time-varying coefficient]
Predictive model (A + B)
[Figure: predictive model (A + B)]
- Comparing the trend component with the coefficient, the trend changes smoothly while the coefficient changes abruptly.
- Factors that affect the points earned per hour, such as the addition of high-performance characters and the implementation of Grand Live, seem to have been absorbed into the coefficient.
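To make the A + B decomposition concrete, here is a small simulation of one common textbook form of a local linear trend combined with a random-walk coefficient on the event length. It is only a sketch under assumptions: the exact formulas are in the linked article and may differ, and the function name `simulate`, the initial states, and all noise scales are made-up values, not estimates from the event data.

```python
import random


def simulate(period_hours, seed=0, sd_level=1.0, sd_slope=0.1, sd_coef=0.5, sd_obs=2.0):
    """Simulate y_t = mu_t + b_t * x_t + noise (all scales are hypothetical).

    mu_t    : trend level   mu_t = mu_{t-1} + delta_{t-1} + level noise
    delta_t : trend slope   delta_t = delta_{t-1} + slope noise
    b_t     : coefficient   b_t = b_{t-1} + coefficient noise
    x_t     : length of the event period in hours
    """
    rng = random.Random(seed)
    mu, delta, b = 100.0, 1.0, 5.0  # hypothetical initial states
    rows = []
    for x in period_hours:
        trend_part = mu        # component (A)
        period_part = b * x    # component (B)
        y = trend_part + period_part + rng.gauss(0.0, sd_obs)
        rows.append({"trend": trend_part, "period": period_part, "y": y})
        # Advance the latent states to the next event.
        mu = mu + delta + rng.gauss(0.0, sd_level)
        delta = delta + rng.gauss(0.0, sd_slope)
        b = b + rng.gauss(0.0, sd_coef)
    return rows


rows = simulate([150, 174, 198, 174])
for r in rows:
    print(f"A={r['trend']:.1f}  B={r['period']:.1f}  y={r['y']:.1f}")
```

In this form, a jump in points earned per hour shows up as a step in `b`, which matches the observation that the coefficient moves abruptly while the trend stays smooth.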
Ratio of measured to predicted values
- I look at the deviation as a ratio: "ratio = measured value / predicted value".
- The reasoning is that the larger the value, the larger the absolute deviation tends to be, so a ratio is easier to compare than a difference.
- "ratio = 1" is a perfect hit.
- When the ratio is above 1, predicted value < measured value; when it is below 1, measured value < predicted value.
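The ratio and its reading can be written down in a few lines. This is an illustrative sketch only: the helper names `deviation_ratio` and `direction` and the example values are assumptions, not taken from the author's scripts.

```python
def deviation_ratio(measured, predicted):
    """ratio = measured value / predicted value, as defined in the text."""
    if predicted <= 0:
        raise ValueError("predicted must be positive")
    return measured / predicted


def direction(ratio):
    """Say on which side the prediction missed."""
    if ratio > 1:
        return "predicted < measured (underestimate)"
    if ratio < 1:
        return "measured < predicted (overestimate)"
    return "exact hit"


# Made-up examples: the same 5% miss at two different scales.
for measured, predicted in [(1_050_000, 1_000_000), (105_000, 100_000)]:
    r = deviation_ratio(measured, predicted)
    print(f"diff={measured - predicted:>7}  ratio={r:.2f}  {direction(r)}")
```

The two examples differ by a factor of ten in absolute error but share the same ratio, which is why the ratio is the more comparable measure across events of different scale.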
Statistics
- The mean is about 1 (this is natural, because the model is fitted to achieve that).
- The standard deviation is 0.0439, so if the ratio is normally distributed, about 68% of events fall within ±4.4%.
- The maximum deviation is about 13% (min, max).
- Half of the events fall within about 3.3% (25%-75%).
| | Deviation (ratio) |
|---|---|
| count | 60 |
| mean | 0.9977 |
| std | 0.0439 |
| min | 0.8816 |
| 25% | 0.9679 |
| 50% | 1.0000 |
| 75% | 1.0199 |
| max | 1.1269 |
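The "68% within one standard deviation" statement relies on the ratio being roughly normal. As a rough check under that assumption, the sketch below fits a normal distribution to the mean and standard deviation from the table and compares its quartiles with the observed ones. It uses only the summary values above, not the per-event data, so it is a sanity check rather than a formal normality test.

```python
from statistics import NormalDist

# Summary values copied from the table above.
mean, std = 0.9977, 0.0439
q25, q75 = 0.9679, 1.0199

fitted = NormalDist(mu=mean, sigma=std)

# Share of a normal distribution lying within one standard deviation of its mean.
within_one_sd = fitted.cdf(mean + std) - fitted.cdf(mean - std)

# Quartiles the fitted normal would have, for comparison with the observed ones.
normal_q25 = fitted.inv_cdf(0.25)
normal_q75 = fitted.inv_cdf(0.75)

print(f"within 1 sd: {within_one_sd:.3f}")
print(f"25%: observed {q25:.4f}, normal {normal_q25:.4f}")
print(f"75%: observed {q75:.4f}, normal {normal_q75:.4f}")
```

The fitted normal puts its quartiles at roughly 0.968 and 1.027, so the observed lower quartile matches closely while the observed upper quartile (1.0199) is somewhat tighter.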
Plots
- Histogram
- Close to a normal distribution
[Figure: histogram of the ratio]
[Figure: image.png]
I also looked at the ratios in descending order and by month, but found no clear regularity.
Estimating the 1st place of an actual event
- The model above was built from [the data up to Hero Versus Reinanjo in the event list](https://imascg-slstage-wiki.gamerch.com/%E3%82%A4%E3%83%99%E3%83%B3%E3%83%88%E3%83%87%E3%83%BC%E3%82%BF), so I tried to estimate the 1st-place points of the next event, [Orange Time](https://imascg-slstage-wiki.gamerch.com/%E3%80%90%E3%82%A4%E3%83%99%E3%83%B3%E3%83%88%E3%80%91%E3%82%AA%E3%83%AC%E3%83%B3%E3%82%B8%E3%82%BF%E3%82%A4%E3%83%A0).
- "Orange Time" information
  - Date: 2020/09/20
  - Period: 174h
  - 1st-place points (answer): **1,250,000**
- Predicted values
  - Point estimate: **1,222,458**
  - Interval estimate (90%): **1,102,110 to 1,344,647**
- The actual value falls within the interval estimate, so the estimation succeeded.
[Figure: estimate for Orange Time]
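The Orange Time result can be checked with the same ratio used in the earlier evaluation. The numbers below are the ones reported in this section; the variable names are only illustrative.

```python
# Values reported for the Orange Time event.
actual = 1_250_000
point_estimate = 1_222_458
lower, upper = 1_102_110, 1_344_647

# Does the 90% interval contain the actual 1st-place points?
inside = lower <= actual <= upper

# Same definition as before: ratio = measured value / predicted value.
ratio = actual / point_estimate

print(f"inside interval: {inside}")
print(f"ratio: {ratio:.4f}")
```

The ratio comes out at about 1.02, meaning the point estimate was roughly 2% below the actual value, which is smaller than the standard deviation of 0.0439 seen on the data used to build the model.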