Equimine.co.uk
30 May 10 11:02
Date Joined: 11 Oct 07
I have recently completed the long process of creating a racehorse pedigree performance database. It covers 1996 to date and (on the flat; NH is on its way) includes over one million racecourse appearances.
This has enabled me to give ratings, and probabilities, for sire, dam or damsire progeny, from good sample sizes. The ratings cover age, distance, going, and age/distance combined.
In other words, it calculates the probability of Montjeu siring a three-year-old colt or gelding rated between 90 and 105 (for instance) running over 10 to 12 furlongs, or of Danehill being the damsire of similar progeny. The probabilities have been derived using normal distribution methods, and run on Excel. The racehorse ratings are our own handicap ratings, generally two or three pounds below RPost ratings.
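A minimal sketch of that kind of normal-distribution calculation, written in Python rather than Excel. It assumes a sire's progeny ratings in a given age/distance group are roughly normal; the mean of 85 and standard deviation of 12 are made-up numbers for illustration, not figures from the database, and `prob_rating_in_band` is a hypothetical helper name.

```python
from statistics import NormalDist


def prob_rating_in_band(mean, sd, low, high):
    """Probability that a rating drawn from Normal(mean, sd) falls in [low, high]."""
    dist = NormalDist(mu=mean, sigma=sd)
    return dist.cdf(high) - dist.cdf(low)


# Hypothetical inputs: progeny of one sire, three-year-olds over 10-12f,
# with an assumed mean rating of 85 and standard deviation of 12.
p = prob_rating_in_band(mean=85.0, sd=12.0, low=90.0, high=105.0)
print(f"P(90 <= rating <= 105) = {p:.3f}")
```

The Excel equivalent would be the difference of two `NORMDIST(x, mean, sd, TRUE)` calls.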
Ideally, I would now like to create probabilities for progeny based upon sire “X”, dam “Y” and damsire “Z”, or combinations of any two of the three. Without getting into a discussion of genetic theory, I am not sure which method of probability calculation to use.
Any thoughts would be greatly appreciated. I was thinking of trying to apply Bayes' principles.
Thanks in advance.
What probability method for a database?
Lori May 30, 2010 12:07 PM BST
I think that you're actually going to need to go into genetic theory to determine how to do this.

I have zero knowledge on that subject, so if this is crude then I apologise.

IMO you need to make some assumptions about which parent the horse is going to take the genes from to work out which probability you're going to need. If it's always 50-50 then you'd do it one way, if it follows a Normal Distribution you'd do it another and if it's skewed based on some kind of "gene strength" factor, you'd do it yet another.

When confronted with decisions like this, I tend to derive several different ratings and just keep them apart from each other and get a feel for the differences over time to help me decide which rating suits me best.
cornubia May 31, 2010 8:24 PM BST
Bayes or logistic regression. Bayes is simpler, but you cannot use dam and damsire together, as they are related in both the mathematical and the family sense. You may also find that the samples get too small, so that you cannot rely on the data for small divisions.
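A rough sketch of the Bayes route for a sire/damsire pair, under two assumptions that are mine rather than anything stated in the thread: the sire and damsire effects are treated as conditionally independent given the outcome (the "naive Bayes" simplification), and small samples are shrunk toward the population rate with a fixed prior weight. All counts and rates below are invented for illustration, and both function names are hypothetical.

```python
def smoothed_rate(hits, runs, prior_rate, prior_weight=20.0):
    """Shrink a raw hit rate toward prior_rate; small samples move least from the prior."""
    return (hits + prior_rate * prior_weight) / (runs + prior_weight)


def combine_naive_bayes(base_rate, p_given_sire, p_given_damsire):
    """P(outcome | sire, damsire), assuming the two are independent given the outcome.

    In odds form: odds(sire) * odds(damsire) / odds(base rate).
    """
    def odds(p):
        return p / (1.0 - p)

    combined = odds(p_given_sire) * odds(p_given_damsire) / odds(base_rate)
    return combined / (1.0 + combined)


# Hypothetical figures: 10% of all runners land in the 90-105 band.
base = 0.10
p_sire = smoothed_rate(hits=60, runs=300, prior_rate=base)    # large sample
p_damsire = smoothed_rate(hits=3, runs=10, prior_rate=base)   # small sample, shrunk hard
p_both = combine_naive_bayes(base, p_sire, p_damsire)
print(f"sire={p_sire:.3f} damsire={p_damsire:.3f} combined={p_both:.3f}")
```

The independence assumption is exactly what fails for dam and damsire, since the dam's record already carries the damsire's influence. A logistic regression with sire, dam and damsire as predictors would be one way around that, at the cost of needing enough runs per combination.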
www.betfair.com