Hello, this is Aoki (@aoki_eng). I tried to quantify the strength of racehorses, and this article summarizes what I did.
The code is on GitHub (https://github.com/katsuomi/keiba-BTmodel)
I love horse racing. Every weekend, I watch all the big races, called graded races, on TV and bet a small amount of money.
Because of that, I often look at the past results of the horses in a race, but I felt that I never knew exactly how strong each horse was. For example, how strong is this horse? It has a lot of first-place finishes, so it seems reasonably strong!
How about this horse? It also has a lot of first-place finishes, so it seems reasonably strong too!
In other words, I can tell intuitively whether a horse is strong or weak, but I can't tell how strong it is.
I want to express the strength of a horse as a concrete number! Out of that curiosity, I decided to quantify it.
Suppose there are n elements (teams or individuals) that compete against each other. Each match is one-on-one, and its only outcome is a win or a loss for one element against the other. We want to measure the "strength" of each element from the results of a number of matches. Let P_ij be the probability that element i beats element j, and introduce a positive parameter π_i for each element such that, for every pair (i, j),

$$ P_{ij} = \frac{\pi_i}{\pi_i + \pi_j} \tag{1} $$

The relation in equation (1) is called the Bradley-Terry (BT) model. In the BT model, π_i can be thought of as representing the strength of element i. Even when two elements have never met directly, the BT model can still compare them through their matches against third parties. (Quoted from here)
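To make the formula concrete, here is a minimal sketch of the win probability in the standard Bradley-Terry form, P_ij = π_i / (π_i + π_j). The function name `win_probability` and the strength values are made up for illustration; they are not taken from the repository.

```python
def win_probability(pi_i: float, pi_j: float) -> float:
    """Probability that element i beats element j: pi_i / (pi_i + pi_j)."""
    return pi_i / (pi_i + pi_j)


# Hypothetical strengths, chosen only for illustration.
strengths = {"A": 3.0, "B": 1.0}

p_ab = win_probability(strengths["A"], strengths["B"])  # A beats B
p_ba = win_probability(strengths["B"], strengths["A"])  # B beats A

print(p_ab, p_ba)
```

Only the ratio of the two strengths matters, so multiplying every π by the same constant leaves all win probabilities unchanged.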
This article does not go into the details of the BT model. Put simply,
**it is a model that can express the strength of each element in a reasonable way for anything that looks like a one-on-one match!**
(It cannot express matchup-specific strengths and weaknesses, as in rock-paper-scissors.)
Common examples are things like:
・Let's show the strength of the Central League and Pacific League teams!
・Let's show the strength of the J League teams!
Now let's think about horse racing. For example, take the result of a race with 18 runners and focus on the horse that finished second. We can say that this horse:
・lost to the horse that finished 1st
・won against the horses that finished 3rd through 18th
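The idea of reading one race as many one-on-one results can be sketched as follows. This is my own illustration under the assumption that every pair of finishers counts as one match; the helper name `race_to_pairwise` and the horse labels are hypothetical, and the linked script may do this differently.

```python
from itertools import combinations


def race_to_pairwise(finish_order):
    """Turn a finishing order (best first) into (winner, loser) pairs.

    combinations() keeps the input order, so the first item of each pair
    always finished ahead of the second one.
    """
    return list(combinations(finish_order, 2))


# Hypothetical 18-runner race: "horse1" finished 1st, "horse18" finished last.
finish_order = [f"horse{k}" for k in range(1, 19)]
pairs = race_to_pairwise(finish_order)

wins_of_second = sum(1 for winner, _ in pairs if winner == "horse2")
losses_of_second = sum(1 for _, loser in pairs if loser == "horse2")
print(len(pairs), wins_of_second, losses_of_second)
```

For the second-place horse this gives one loss (to the winner) and sixteen wins (against 3rd through 18th), which matches the reading above.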
In this way, I decided to think of a horse race as "a set of racehorse vs racehorse matches" and apply the BT model to it.
・Scrape the race results from 2014 to 2018 from the official JRA website and tabulate them
・Apply the BT model to the results
The specific implementation is posted here (https://github.com/katsuomi/keiba-BTmodel/blob/master/pointToHorseStrength.py)
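The linked script is the actual implementation. Separately from it, here is a minimal sketch of one common way to estimate BT strengths from (winner, loser) pairs: the iterative update π_i ← W_i / Σ_j n_ij / (π_i + π_j), where W_i is the number of wins of i and n_ij is the number of meetings between i and j. The function name `fit_bradley_terry`, the iteration count, and the toy data are assumptions for illustration only.

```python
from collections import defaultdict


def fit_bradley_terry(pairs, n_iter=200):
    """Estimate BT strengths from (winner, loser) pairs by fixed-point iteration."""
    items = sorted({h for pair in pairs for h in pair})
    wins = {h: 0 for h in items}   # W_i: total wins of i
    meetings = defaultdict(int)    # n_ij: number of meetings between i and j
    for winner, loser in pairs:
        wins[winner] += 1
        meetings[frozenset((winner, loser))] += 1

    pi = {h: 1.0 for h in items}   # start with equal strengths
    for _ in range(n_iter):
        new_pi = {}
        for i in items:
            denom = sum(
                meetings[frozenset((i, j))] / (pi[i] + pi[j])
                for j in items
                if j != i
            )
            new_pi[i] = wins[i] / denom
        # Only ratios matter, so rescale to keep the average strength at 1.
        total = sum(new_pi.values())
        pi = {h: v * len(items) / total for h, v in new_pi.items()}
    return pi


# Toy data: A and C never meet directly, but both have met B.
toy_pairs = [("A", "B"), ("A", "B"), ("B", "A"),
             ("B", "C"), ("B", "C"), ("C", "B")]
pi = fit_bradley_terry(toy_pairs)
p_a_beats_c = pi["A"] / (pi["A"] + pi["C"])
print(pi, p_a_beats_c)
```

In the toy data A and C are compared only through B, which is the "third party" property of the BT model. Note that this simple update assumes every element has at least one win and one loss; a horse that never loses in the data has no finite estimate and would need special handling.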
The strongest racehorse among active horses is **Almond Eye**!
This time, I used the BT model to put a number on the strength of racehorses. In the end, the horses with high values were the ones that are doing well now or did well in the past, so there was nothing especially new to learn. (lol)
By the way, this weekend there will be a race called the Victoria Mile, and Almond Eye, the strongest active racehorse, is scheduled to run. It is an upset-prone race every year, but ... !!!!
[References]
・About the Bradley-Terry model: https://www.gavo.t.u-tokyo.ac.jp/~mine/japanese/IT/2017/toukei171211.pdf
・Horse performance: https://www.netkeiba.com/
・Past race information: http://www.jra.go.jp/