[PYTHON] Load multiple CSV files at once with glob (boat race machine learning prediction results, June review)

Introduction

On the boat race trifecta forecast site "Today, do you have a good forecast?", the hit rate and recovery rate of the daily race forecasts are published openly. However, I also wanted to aggregate the daily forecast results once a month, so I put together the processing described in the title.

My situation

I organize the boat race results and the machine learning predictions and save them every day under names such as "result_2020mmdd.csv". Once a month I want to combine these files and visualize the results.

I also want to skip files like result_202006.._test.csv, which are mixed in here and there as shown in the figure below, because they are only for testing.

image.png

The code looks like this.

It's very simple.


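import glob

import pandas as pd

# "?" matches exactly one character, so "??" picks up the two-character day
# and skips names with a longer suffix such as "_test".
csv_files = sorted(glob.glob("predict/result/result_202006??.csv"))

# Read each daily CSV into its own DataFrame.
filelist = []
for file in csv_files:
    filelist.append(pd.read_csv(file))

# Stack the daily frames into one DataFrame for the month.
df = pd.concat(filelist, ignore_index=True)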

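When collecting the path names with glob, I use the pattern result_202006??.csv. Each ? matches exactly one arbitrary character, so ?? covers the two-digit day, while file names with a longer suffix such as _test do not match and are left out.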
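To see the effect of the pattern without touching the file system, here is a minimal sketch using fnmatchcase from the standard library, which applies the same shell-style wildcard rules that glob uses. The file names in the list are hypothetical examples modeled on the naming scheme above, and the stricter pattern with [0-9] is an optional variation, not something the original code uses.

```python
from fnmatch import fnmatchcase

PATTERN = "result_202006??.csv"          # the pattern used in this article
STRICT = "result_202006[0-9][0-9].csv"   # optional variant: digits only

# Hypothetical file names, for illustration only.
names = [
    "result_20200601.csv",
    "result_20200630.csv",
    "result_20200601_test.csv",  # longer than the pattern, so it is skipped
    "result_20200701.csv",       # different month, so it is skipped
    "result_202006ab.csv",       # "?" accepts letters as well as digits
]

matched = [n for n in names if fnmatchcase(n, PATTERN)]
strict_matched = [n for n in names if fnmatchcase(n, STRICT)]

print(matched)
print(strict_matched)
```

Because ? accepts any character, the digits-only variant may be the safer choice if files with non-numeric suffixes of the same length could ever appear in the folder.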

csv_files now holds a clean list of the matching file names together with their paths.

image.png

Read the files one by one in a for loop, append each to a list, and finally concatenate them into a single DataFrame. Done!
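The same steps can also be wrapped in a small function so that the month can be swapped out. The following is only a sketch under some assumptions: load_month, the source_file column, and the demo columns race and hit are hypothetical names introduced here, and the throwaway files exist only to make the example runnable.

```python
import glob
import os
import tempfile

import pandas as pd


def load_month(directory, yyyymm):
    """Concatenate every result_<yyyymm>??.csv found in `directory`."""
    # glob.escape protects against wildcard characters in the directory name.
    pattern = os.path.join(glob.escape(directory), f"result_{yyyymm}??.csv")
    paths = sorted(glob.glob(pattern))
    if not paths:
        # pd.concat raises on an empty list, so fail with a clearer message.
        raise FileNotFoundError(f"no files match {pattern}")
    frames = []
    for path in paths:
        frame = pd.read_csv(path)
        # Remember which daily file each row came from.
        frame["source_file"] = os.path.basename(path)
        frames.append(frame)
    return pd.concat(frames, ignore_index=True)


# Demonstration with throwaway files; the contents are made up.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("result_20200601.csv", "result_20200602.csv",
                 "result_20200602_test.csv"):
        sample = pd.DataFrame({"race": [1, 2], "hit": [0, 1]})
        sample.to_csv(os.path.join(tmp, name), index=False)
    demo = load_month(tmp, "202006")

print(demo)
```

Keeping the file name in a column is optional, but it makes it easy to trace a row back to its day when the monthly total looks odd.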

So, how were the results for June?

I checked the results using the DataFrame created above. The trifecta hit rate was 10% as designed, but the recovery rate was just over 80%. (That said, compared with other free forecasts, this still seems relatively competitive.)

image.png

Here is the result of aggregating the data per venue with pivot_table. It is interesting how clearly the venues split into those that are easy to hit and those that hardly hit at all. This month, let's buy only at the venues that are easy to hit.

Site         Hit  Miss  Payoff (yen)  Return_ratio (%)
Marugame       5    26          5480            176.77
Gamagori       3    21          3990            166.25
Tokuyama       9    33          6800            161.9
Biwako         9    31          5810            145.25
Hamanako       8    43          6030            118.24
Toda           4    46          5470            109.4
Edogawa        3    24          2740            101.48
Naruto         6    44          5060            101.2
Tokoname       4    38          4190             99.76
Kojima         5    58          6230             98.89
Suminoe        2    33          3340             95.43
Fukuoka        2    14          1460             91.25
Ashiya         6    46          4540             87.31
Omura          8    51          4770             80.85
Karatsu        3    30          2600             78.79
Amagasaki      5    52          4270             74.91
Miyajima       5    36          2400             58.54
Heiwajima      3    40          2510             58.37
Tamagawa       3    32          1850             52.86
Shimonoseki    3    54          2760             48.42
Mikuni         5    58          3030             48.1
Wakamatsu      4    56          1780             29.67
Kiryu          1    47          1190             24.79
Tsu            3    67          1290             18.43
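The aggregation code itself is not shown in this article, so the following is only a sketch of how a table like this could be built with pivot_table. The column names site, hit and payoff, the per-race rows, and the 100 yen stake per race are all assumptions. The stake at least agrees with the figures in the table: for Marugame, 5480 yen paid out over 5 + 26 races works out to 176.77%.

```python
import pandas as pd

STAKE = 100  # assumed stake per race, in yen

# Made-up per-race rows; the column names are hypothetical.
races = pd.DataFrame({
    "site": ["A", "A", "A", "B", "B"],
    "hit": [1, 0, 0, 0, 0],            # 1 if the forecast hit, else 0
    "payoff": [1200, 0, 0, 0, 0],      # payout in yen, 0 on a miss
})

# Sum hits and payoffs per venue.
summary = races.pivot_table(index="site", values=["hit", "payoff"], aggfunc="sum")

# Count races per venue, then derive misses and the return ratio in percent.
summary["races"] = races.groupby("site").size()
summary["miss"] = summary["races"] - summary["hit"]
summary["return_ratio"] = (
    summary["payoff"] / (summary["races"] * STAKE) * 100
).round(2)
summary = summary.sort_values("return_ratio", ascending=False)

print(summary)

# Consistency check against the Marugame row of the table above.
marugame_ratio = round(5480 / ((5 + 26) * STAKE) * 100, 2)
print(marugame_ratio)
```

Sorting by the return ratio in descending order reproduces the ordering used in the table, which makes the gap between the best and worst venues easy to read.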
