Database Handicapping Software- JCapper

JCapper Message Board

          JCapper 101
                      -- Starting Factors

Caveat
8/9/2012
7:36:50 AM

Hi All...

I am still new to this and learning... spent a lot of time in the old message boards over the last week or so.

I'm playing around with constructing UDM's and would like to ask a question..

In doing a track UDM, I believe that a good starting factor would be UPR, CPACE, AFR, CFA, etc.
What would you do when none of those are near the top in ROI?
I have factors like these near the top...

code:
FACTOR            PLAYS  WINS  WIN PCT  IMPACT  WIN ROI  PLACES  PLACE PCT  PLACE ROI
FASTSLOWFINAL       103    32   0.3107  2.5241   1.1738      48      0.466     0.9112
CXN                 103    31    0.301  2.4453   1.1563      51     0.4951     1.0165
WILLTOWIN           176    19    0.108  0.8774   1.1313      44       0.25     0.9568
LASTRACEBRISFIG     111    33   0.2973  2.4152      1.1      52     0.4685     0.9221
COMPOUNDLATE        103    18   0.1748    1.42   1.0893      30     0.2913     0.7447
LATESLANT           103    25   0.2427  1.9716   1.0592      42     0.4078      0.932
BETTORSTOTEPROB     103    45   0.4369  3.5493   1.0282      65     0.6311      0.984
CONSISTENCY         103    26   0.2524  2.0504   1.0243      46     0.4466     0.9995
POST TIME FAVS      115    49   0.4261  3.4616   1.0209      71     0.6174     0.9752
QSPEEDPOINTS        131    33   0.2519  2.0464   1.0038      58     0.4427     1.0011
TURNTIME            103    26   0.2524  2.0504   0.9728      42     0.4078     0.8068
COMPOUNDE1          103    25   0.2427  1.9716    0.967      41     0.3981     0.8573
AVGE1               105    26   0.2476  2.0115   0.9638      43     0.4095     0.8124



Thx
Mike

Reply
Charlie James
8/9/2012
11:28:03 AM
Imho, you are [still] ignoring some very good advice given out in the private section of this board. That advice went something like this: Model the big picture and don't cherry pick from among small sample results.

The danger in cherry picking from small sample results is that U R going down a path likely 2 get U the world's biggest back-fit. And when that back-fitted model doesn't produce the same good results going fwd U sit there wondering why.

[Not U personally -- but U meaning newbie players in general.]
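A rough way to see the back-fit danger described above is to simulate factors that have no edge at all, pick the one with the best ROI in a small sample, and then look at that same kind of factor on fresh races. This is only a sketch: the 30 factors, 100 plays per sample, 30% win rate and flat $5.60 payoff are made-up numbers chosen so that every factor has a true ROI of 0.84. None of it comes from JCapper.

```python
import random

def simulate_roi(plays, win_prob, payoff, rng):
    """ROI of flat $2 win bets when each play wins with win_prob and pays payoff."""
    returned = sum(payoff for _ in range(plays) if rng.random() < win_prob)
    return returned / (2.0 * plays)

def cherry_pick(num_factors=30, plays=100, win_prob=0.30, payoff=5.60, seed=1):
    rng = random.Random(seed)
    # Development sample: every factor has the same true ROI (0.30 * 5.60 / 2 = 0.84).
    in_sample = [simulate_roi(plays, win_prob, payoff, rng) for _ in range(num_factors)]
    best_roi = max(in_sample)
    avg_roi = sum(in_sample) / num_factors
    # Fresh races for the cherry-picked factor: same true ROI, new luck.
    fresh_roi = simulate_roi(plays, win_prob, payoff, rng)
    return best_roi, avg_roi, fresh_roi

best_roi, avg_roi, fresh_roi = cherry_pick()
print(f"best in-sample ROI {best_roi:.2f}, average {avg_roi:.2f}, fresh races {fresh_roi:.2f}")
```

The top factor in the development sample will usually look better than 0.84 purely through luck, and nothing about that luck carries over to the fresh races.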

We were lucky enough to have the program's author present us with a "what it takes" write up -- backed up by 4 years of data.

I'm going to ask a serious question:

Do U believe U have the requisite talent and insight into what it takes in this game to go down a path that ignores the advice U were given?

~Edited by: Charlie James  on:  8/9/2012  at:  11:25:43 AM~

~Edited by: Charlie James  on:  8/9/2012  at:  11:27:31 AM~

~Edited by: Charlie James  on:  8/9/2012  at:  11:28:03 AM~

Reply
Charlie James
8/9/2012
12:07:25 PM
Example -- SAR dirt sprints 2011:


code:
     query start:         8/9/2012 9:37:21 AM
query end: 8/9/2012 9:37:23 AM
elapsed time: 2 seconds

Data Window Settings:
Connected to: C:\JCapper\exe\JCapper2.mdb
999 Divisor Odds Cap: None
Betting Instructions: Testing Purposes Only

UDM: 0_1TrackDateExpression

SQL: SELECT * FROM STARTERHISTORY
WHERE TRACK='SAR'
AND INTSURFACE <= 3
AND DIST < 1760
AND [YEAR] = 2011


Data Summary         Win     Place      Show
Mutuel Totals    2067.00   2189.90   2122.30
Bet             -2628.00  -2628.00  -2628.00
Gain             -561.00   -438.10   -505.70

Wins                 177       356       505
Plays               1314      1314      1314
PCT                .1347     .2709     .3843

ROI               0.7865    0.8333    0.8076
Avg Mut            11.68      6.15      4.20


****************************************************************************************
Key Factors Rank = 1 sorted by Win ROI Run Date: 8/9/2012 9:37:23 AM
****************************************************************************************
                                    WIN     WIN     WIN           PLACE   PLACE
FACTOR              PLAYS  WINS     PCT  IMPACT     ROI  PLACES     PCT     ROI
****************************************************************************************
COMPOUNDAP F18        182    59  0.3242  2.4068  1.1637      82  0.4505  0.9255
FIGCONSENSUS F13      191    59  0.3089  2.2932  1.1301      92  0.4817  1.0099
JPRCLASS F28          172    57  0.3314  2.4602  1.0718      86     0.5  0.9945
JRATING F29           172    50  0.2907  2.1581  1.0613      75   0.436  0.9087
USERFACTOR3 F33       172    51  0.2965  2.2011   1.048      74  0.4302  0.8547
POWERCONSENSUS F32    180    59  0.3278  2.4335  1.0342      86  0.4778  0.8928
UPR                   172    55  0.3198  2.3741  1.0337      86     0.5  0.9721
COMPOUNDSP F24        182    51  0.2802  2.0801  1.0327      84  0.4615  0.9819
USERFACTOR4 F34       172    52  0.3023  2.2442  1.0203      78  0.4535  0.8919
TPACE F11             182    55  0.3022  2.2435  1.0091      87   0.478  1.0195
CLASSCONSENSUS F27    193    63  0.3264  2.4231  1.0085      94   0.487  0.9554
WEIGHTEDFIG F12       182    49  0.2692  1.9985  1.0077      74  0.4066  0.8624
UPRMLPROB             173    61  0.3526  2.6176  1.0061      91   0.526  0.9624
JPRMLPROB             173    57  0.3295  2.4461  0.9971      87  0.5029  0.9318
PACEFIG F10           184    53   0.288   2.138  0.9965      83  0.4511  0.9383
JPR                   172    53  0.3081  2.2873  0.9826      82  0.4767  0.9238...
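For anyone new to the Data Window, the columns above can be checked by hand. The sketch below is inferred from the numbers in this post, not taken from JCapper documentation. It assumes a $2 base bet, that ROI is mutuel totals divided by the amount bet, and that IMPACT is the factor's win pct divided by the win pct of the whole sample. The function names are made up for illustration.

```python
def win_pct(wins, plays):
    return wins / plays

def roi(mutuel_total, plays, bet_size=2.0):
    # Total returned divided by total wagered (assumes flat bets of bet_size).
    return mutuel_total / (plays * bet_size)

def impact(factor_wins, factor_plays, sample_wins, sample_plays):
    # Assumed definition: factor win pct relative to the win pct of the full sample.
    return win_pct(factor_wins, factor_plays) / win_pct(sample_wins, sample_plays)

# SAR dirt sprints 2011, whole sample
print(round(win_pct(177, 1314), 4))          # 0.1347, the PCT shown in the Data Summary
print(round(roi(2067.00, 1314), 4))          # 0.7865, the win ROI shown in the Data Summary

# COMPOUNDAP rank = 1 within that sample
print(round(win_pct(59, 182), 4))            # 0.3242
print(round(impact(59, 182, 177, 1314), 2))  # 2.41, in line with the 2.4068 in the table
```

An impact value above 1 means the factor's top-ranked horses win more often than the average starter in the sample. It says nothing by itself about price, which is why the ROI column matters.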







Now -- SAR dirt sprints 2012:


code:
     query start:         8/9/2012 9:40:15 AM
query end: 8/9/2012 9:40:16 AM
elapsed time: 1 seconds

Data Window Settings:
Connected to: C:\JCapper\exe\JCapper2.mdb
999 Divisor Odds Cap: None
Betting Instructions: Testing Purposes Only

UDM: 0_1TrackDateExpression

SQL: SELECT * FROM STARTERHISTORY
WHERE TRACK='SAR'
AND INTSURFACE <= 3
AND DIST < 1760
AND [YEAR] = 2012


Data Summary         Win     Place      Show
Mutuel Totals    1039.60   1014.40    919.40
Bet             -1088.00  -1088.00  -1088.00
Gain              -48.40    -73.60   -168.60

Wins                  69       140       210
Plays                544       544       544
PCT                .1268     .2574     .3860

ROI               0.9555    0.9324    0.8450
Avg Mut            15.07      7.25      4.38




****************************************************************************************
Key Factors Rank = 1 sorted by Win ROI Run Date: 8/9/2012 9:40:16 AM
****************************************************************************************
                                    WIN     WIN     WIN           PLACE   PLACE
FACTOR              PLAYS  WINS     PCT  IMPACT     ROI  PLACES     PCT     ROI
****************************************************************************************
PEDIGREE F15           70    13  0.1857  1.4641  1.2836      31  0.4429  1.3221
CPACE F20              69    16  0.2319  1.8283   1.158      25  0.3623   0.813
EARLYCONSENSUS F19     74    17  0.2297   1.811  1.1473      28  0.3784  0.9014
MORNINGLINE            71    29  0.4085  3.2206  1.0718      39  0.5493  0.8915
AFR F01                69    15  0.2174   1.714  0.9746      24  0.3478  0.7645
USERFACTOR4 F34        69    18  0.2609   2.057  0.9565      25  0.3623  0.6891
BETTORSTOTEPROB        70    26  0.3714  2.9281  0.9129      40  0.5714    0.92
USERFACTOR3 F33        69    19  0.2754  2.1713  0.8877      29  0.4203  0.8739
WOBRILL F04            70    14     0.2  1.5768  0.8814      21     0.3  0.7121
PRIME F31              73    22  0.3014  2.3763  0.8589      38  0.5205  0.9164
POST TIME FAVS         71    25  0.3521   2.776  0.8542      41  0.5775   0.931
CLASSCONSENSUS F27     71    20  0.2817  2.2209  0.8275      29  0.4085  0.7549
FORM F03               72    14  0.1944  1.5327  0.8174      23  0.3194  0.8694
BASICFITNESS F02      105    19   0.181   1.427  0.7633      30  0.2857  0.8033
CFA F08                69    16  0.2319  1.8283  0.7616      27  0.3913  0.7355
RACESTRENGTH F16       70    19  0.2714  2.1397  0.7593      29  0.4143  0.7021
JPRCLASS F28           69    19  0.2754  2.1713  0.7543      31  0.4493  0.7725
JPR                    69    17  0.2464  1.9426  0.7442      30  0.4348  0.7812
OPTIMIZATION F30       79    14  0.1772  1.3971  0.7437      22  0.2785  0.6468
USERFACTOR5 F05        69    11  0.1594  1.2567  0.7196      21  0.3043  0.9645
FORMCONSENSUS F07      72    13  0.1806  1.4239  0.7076      19  0.2639  0.6979
UPRMLPROB              69    19  0.2754  2.1713  0.6978      30  0.4348  0.6957
COMPOUNDSP F24         70    17  0.2429   1.915  0.6836      29  0.4143  0.7364
FASTSLOWFINAL F09      70    16  0.2286  1.8023  0.6829      30  0.4286  0.8007
COMPOUNDAP F18         69    17  0.2464  1.9426  0.6812      30  0.4348  0.8565...





In 2011 CompAP rank=1 was the top factor [32% winners, 1.16 roi]. But so far in 2012 CompAP rank=1 has only 25% winners and a 0.68 roi.

Q. Why?

A. Speaking strictly for myself, I haven't the 1st clue. The best insight I can come up with after looking at this and hundreds of similar [small] data samples is:

Because that's the way horse racing data behaves.
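One way to put numbers on how noisy samples this small are is a back-of-the-envelope 95% interval on win pct, using the normal approximation to the binomial. This is a textbook approximation used here purely for illustration. It is not something the Data Window reports, and the function name is made up.

```python
import math

def win_pct_interval(wins, plays, z=1.96):
    """Approximate 95% interval for a win pct (normal approximation to the binomial)."""
    p = wins / plays
    standard_error = math.sqrt(p * (1 - p) / plays)
    return p - z * standard_error, p + z * standard_error

# CompoundAP rank = 1 in the two SAR dirt sprint samples above
samples = [("2011", 59, 182), ("2012", 17, 69)]
for year, wins, plays in samples:
    low, high = win_pct_interval(wins, plays)
    print(f"{year}: {wins}/{plays} = {wins / plays:.3f}, roughly {low:.3f} to {high:.3f}")
```

The two intervals overlap (roughly 0.26 to 0.39 for 2011 and 0.14 to 0.35 for 2012), so on win pct alone these samples cannot tell the two years apart. ROI is typically noisier still, because it also depends on which winners happened to pay boxcars.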

Q. Caveat, can you [or anyone else?] provide me with insight -- backed up by reasoning and data samples -- that flies in the face of this? -- Can U or anyone else tell me WHEN to expect cherry picked results to perform well going fwd and why?

Because if I could only know ahead of time when to cherry pick and what to cherry pick [and why] -- this game would be ridiculously EZ to beat.

Until such time as that happens I will continue on the path that has produced reasonably good results -- at least 4 me: I will stick 2 modeling the big picture.



~Edited by: Charlie James  on:  8/9/2012  at:  12:05:21 PM~

~Edited by: Charlie James  on:  8/9/2012  at:  12:07:25 PM~

Reply
jeff
8/9/2012
1:58:04 PM
Searching through past posts on the same topic, I came up with the following thread:
http://www.jcapper.com/messageboard/TopicReader.asp?topic=1105&forum=JCapper%20101

I thought Steve's reply was relevant and the bolded text from his quote was put there by me to emphasize what he said:


--quote:
"I agree with the above. All too often, factors which do a great job of filtering out losers and boosting your ROI in your development sample, end up filtering out winners in your fresh data. You always want to keep independent data available to test anything you're doing."
--end quote


In my opinion, after creating any UDM, whether the R&D behind it used a large sample or a small one, you want to validate performance going forward by confronting the UDM with races from outside the sample used to develop it. Hold off betting real money on the UDM until you see clear evidence that the concept encapsulated in it is "validating", that is, performing well going forward in time.
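In outline, that validation step might look something like the sketch below. The play records, the `roi` helper and the cutoff date are all hypothetical. In practice the plays would come out of a database query, not a hard-coded list.

```python
from datetime import date

# Hypothetical UDM plays: (race date, $2 win payoff, or 0.0 if the horse lost)
plays = [
    (date(2011, 7, 23), 8.40), (date(2011, 8, 5), 0.0),
    (date(2011, 8, 20), 5.20), (date(2011, 9, 1), 0.0),
    (date(2012, 7, 21), 0.0), (date(2012, 7, 28), 0.0),
    (date(2012, 8, 4), 6.60), (date(2012, 8, 8), 0.0),
]

def roi(records, bet_size=2.0):
    """Total returned divided by total bet; None when there are no plays."""
    if not records:
        return None
    return sum(payoff for _, payoff in records) / (bet_size * len(records))

def split_by_date(records, cutoff):
    """Races before the cutoff were used for development; the rest are fresh."""
    development = [r for r in records if r[0] < cutoff]
    validation = [r for r in records if r[0] >= cutoff]
    return development, validation

development, validation = split_by_date(plays, date(2012, 1, 1))
print("development ROI:", round(roi(development), 4))
print("validation ROI:", round(roi(validation), 4))
```

The point of the split is that nothing from the validation races is allowed to influence the UDM definition. If the validation ROI falls apart, that is the signal to hold off on real money.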




I also found the following thread on Track Profile Theory:
http://www.jcapper.com/MessageBoard/TopicReader.asp?topic=135&forum=General

Speaking strictly for myself (and admittedly it is an acquired skill) I have had success in the past when I am able to relate Data Window results to a physical cause.

For example, in the SAR 2012 dirt sprint results that Chuck posted above, I see clear evidence of an early speed bias. I say that because I know at a glance that CPace and EarlyConsensus performance in those results is above historical norms.

But what cements it for me is watching races run there so far this meet. The leaders aren't getting tired. Also, a look at the overhead camera "snapshot" talked about in the Track Profile Theory thread indicates (at least to me) that the numbers in the Data Window aren't the result of some random cosmic accident - that they are in fact being produced because there is an actual speed bias.

Q. What causes that bias? Is it the weather? Humidity? Track maintenance? Or are other unexplained phenomena at work?

A. I haven't the first clue. But I do know from looking at video of the horses and from looking at overhead snapshots of where the horses are when the winner breaks the plane of the finish line and from numbers in the Data Window that a speed bias is there.

Q. Will that speed bias continue?

A. I haven't the first clue. If you believe in track profile theory - go for it.

If it suddenly reverses tomorrow: don't be surprised.

If it holds up through closing day - but weather, humidity, track maintenance next year cause the same surface to favor closers: don't be surprised by that either.


-jp

.

Reply
Charlie James
8/9/2012
4:52:01 PM
Jeff, Love the overhead snapshot concept. Brilliant if U ask me.

Fwiw I've never personally been able to lock into a speed bias early enough to take advantage. By the time I see it so has everybody else -- and the boxcar prices have already been paid out. By the time I pick up on it, oh the bias is still there -- but the public k_n_o_w_s and the result is a chalk parade [which causes the trainers to complain to management who in turn tells the track super to harrow deeper.]

For some reason I have no problem using speed and separation as the universal bias. Imho still the single best trip to the winners circle [even after the advent of polyshite.] I also have no problem working to educate myself -- to come up with a short list of trainers with the proven talent to prep their babies in such a way to take full advantage of the so called universal bias. -- Made easier I guess by years of going to KEE each fall to follow who bought what and for whom -- and then watch it all unfold the following spring as the babies grow up and work their way through the condition book.

A good buddy of mine used to go to Fla and then later in life Ariz every March to watch the kids in the farm system vie for spots in the big leagues. Point is -- follow this or any game closely enough and going beyond the numbers [I like that phrase] is suddenly within reach.

Different ways to skin a cat I guess.

~Edited by: Charlie James  on:  8/9/2012  at:  4:50:18 PM~

~Edited by: Charlie James  on:  8/9/2012  at:  4:52:01 PM~

Reply
jeff
8/10/2012
12:06:16 AM
Mike, I wanted to make a few specific comments about your post.

In my way of doing things, there are two different types of UDMs and each has a very different yet specific purpose:

1. The Business UDM - When Chuck posted the words "model the big picture" (and bluntly I might add) I'm about 99% sure he was talking about Business UDMs.

In JCapper terminology, a Business UDM is a UDM designed to point out horses that are very close to being automatic bets - those that have lots of hidden positive attributes in their past performance records. In the "what it takes" write up, I hope I was able to make the point that the phrase positive hidden attributes is in no way limited to "handicapping" in the traditional sense. (In fact, the topic of "handicapping" was purposely avoided.) Instead of basing the modeling process around attributes tied to the horse - the process was turned on its head - and the "handicapping" was instead focused on (less than perfect) public betting behavior.

When a Business UDM flags a horse on one of my reports - it (rightly) deserves my intense focus. I say that because years of Data Window R&D and wager history analysis very clearly tell me those are the UDMs driving my profits.

That last sentence describes what is meant by the term "Business UDM."

By the way, I fully agree with Chuck. When creating a Business UDM, forget the small sample and the track specific. Model the big picture instead.

2. The Layering UDM - In JCapper terminology, the Layering UDM can be anything (small sample or large) that adds an additional "layer" of knowledge to the player's understanding of something related to either the race or individual horses in the race. A Layering UDM can literally be based on anything.

Last Thurs I spent a day at DMR and ran into John Doyle. For those of you who may not be aware, John was the overall winner of the NHC tournament in Jan 2011. (No, John is not a JCapper guy.) He mostly uses the DRF and a ball point pen.

IMHO, if horse racing were chess John would be a grand master while the rest of us (myself included) would be local tournament players at best. It's both scary and amazing how quickly he subconsciously homes in on patterns.

Anyway, we came to a MClm race with a Sadler FTS in it. John instantly knows that Sadler is 3 for 8 at DMR with FTS in MClm races... win pct .375. Quick odds conversion = approx 8/5.

Conversely, John says that Sadler is 1 for 11 with FTS in SPLWT at DMR and that he wouldn't touch a Sadler FTS in a SPLWT at DMR with my money... well, maybe with MY money! But definitely not his money.

Me? I sense small sample syndrome at work. Not liking anything else in the race I pass.

John loads up on what he sees as very generous odds (7/2) and watches his Sadler FTS stalk the pace while in hand, pull even at about the 1/8th pole - and then win going away late.
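For anyone following the arithmetic, the quick odds conversion is just break-even odds-to-1 = (1 - p) / p. A small sketch (the function name is only for illustration):

```python
def fair_odds_to_one(win_pct):
    """Break-even odds-to-1 for a given win probability."""
    return (1.0 - win_pct) / win_pct

sadler_fts_mclm = 3 / 8                    # 3 for 8 = .375
fair = fair_odds_to_one(sadler_fts_mclm)   # about 1.67-to-1, close to 8/5 (1.6-to-1)
offered = 7 / 2                            # 3.5-to-1 on the board

print(f"fair odds {fair:.2f}-to-1, offered {offered:.1f}-to-1")
print("overlay" if offered > fair else "underlay")
```

At 7/2 the offered price was well above the break-even price implied by a .375 win pct, which is why the odds looked generous to John. Whether 8 starts is enough to trust that .375 is a separate question.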

Later that night over beers and dinner at a nice place, John and I are talking "shop." It occurs to me that in this case, even though he's not a JCapper guy - John had a "Layering" UDM in his head based on Sadler FTS's in MClm races at DMR.

It also occurs to me that John won the NHC and I didn't. That fact is not lost on me.

(It's not like I didn't have a good day myself. I did. But it's just like Chuck says... many ways to skin a cat.)




Below are a couple of links to some older posts in the private section of the board where I laid out a few of my thoughts on Layering UDMs and how best to use them.

JCapper Under The Hood - Jan, 2011:
http://www.jcapper.com/MessageBoard/TopicReader.asp?topic=903&forum=Private

Stunned by JCapper yet again - June, 2012:
http://www.jcapper.com/MessageBoard/TopicReader.asp?topic=1277&forum=Private




Wrapping this up... if that's even possible... the Layering UDM absolutely CAN be based on the small sample and the track specific. Some of mine are.

My best successes always seem to come about where Business UDM and Layering UDM meet.


-jp

.



~Edited by: jeff  on:  8/10/2012  at:  12:06:16 AM~

Reply
Caveat
8/10/2012
11:30:03 AM
Thxs Charlie!!
While I was at work yesterday, I took a peek to see if I got a response. I glanced over it quickly, and after I saw the charts that you put up... then it hit me.
At first when you mentioned back-fitting and the big picture, I wasn't sure what you were talking about and was shy to ask.
Now, with those charts, I can see that if I had built a UDM based on last year's numbers, it would have been disastrous come 2012.
2011 showed average paced horses having the advantage... this year it would be early horses having the advantage.
You asked what could be cherry picked. I'm guessing maybe trainers, connections, post bias and others, if the data shows up again on those factors.
Databases are completely new to me, so please be patient :)
Moving forward, I now have data on what happened way back and data on what's happening now. I would put an emphasis toward early.
Jeff, thxs for taking the time to post a reply. I will get to your stuff soon.

Mike

Things have to sink in little by little

~Edited by: Caveat  on:  8/10/2012  at:  9:25:08 AM~

Stupid me... I was thinking that there was only one page in JCP 101, because I didn't find page numbers at the bottom...
There's a back button!! WOW... Tons of reading :)

~Edited by: Caveat  on:  8/10/2012  at:  11:06:26 AM~

~Edited by: Caveat  on:  8/10/2012  at:  11:06:53 AM~

~Edited by: Caveat  on:  8/10/2012  at:  11:30:03 AM~

Reply
Windoor
8/11/2012
9:47:02 PM
I will add my two cents on why results can be so different from one year to the next, and what you might be able to do about it.

Need I say it? All in my most humble opinion.

I believe there are many different kinds of races other than simply Maiden, Claiming, Allowance, Stakes, Handicap, and variations of same. Having said that, I believe they all can be broken down by Track, Distance, Class, Age, Surface (including today's variant), Sex and time of year.

I call this "The Seven" and it is how I separate the different kinds of races. When you consider all of the sub categories for each and possible combinations of them, there are many, many and more to take into consideration. You now see just how complex our problem can be when deciding on what factors to use in our UDMs.

Some factors can transcend many categories. Others, like some performance factors (speed and pace), can drastically change with the track conditions, and this can happen on a daily basis at some tracks. I tend to stay away from them (performance factors) even though they can indeed show a healthy win percent. It's the average odds I object to.

So what changed from this year to last year that made such a drastic difference in the ROI? As mentioned above, it could be a simple track bias. It also could be the "type" of races being run. Maybe a lot more of "non winners of three" or one, or two, or conditional allowance races, etc. It can be many things. Knowing what is going to be the dominant factor today, for this "type" of race is the real challenge in my view.

I now have Seven Key factors (only one is based on speed), Seven Primary factors, and Seven Secondary factors. They all have value, but some really shine when a specific "type" of race comes along.

I used to have a signature statement that said, "The Numbers Have Hinges". This is a reference to factors whose "Value" has changed due to the type of race being run. A top ranked factor that used to give us a lot of winners may now be nearly useless due to changes in the track surface or "types" of races being run.

I would recommend to anyone who is still struggling to maintain a profit, to break down their plays by the Seven. Pick one from each category, and build a UDM for it. Test it for a three year period (or more), or at least a few hundred consecutive races. Then support it with a small bank to see if it grows. It may very well work at other tracks, class levels, distances or any of the seven. Only a large database that can show enough consecutive plays can tell you if it has value or not.

Even then, there is no guarantee that it will work going forward. I start each with a very small bank, and only increase the wager when the bank has grown enough to support it.
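A bare-bones sketch of the "break it down by the Seven" bookkeeping might look like this. Everything here is hypothetical: the field names, the three sample plays, and the 200-play threshold standing in for "a few hundred consecutive races".

```python
from collections import defaultdict

# The Seven, as hypothetical field names on each play record
SEVEN = ("track", "distance", "race_class", "age", "surface", "sex", "season")

def race_type(play):
    """The combination of the Seven that identifies a 'type' of race."""
    return tuple(play[name] for name in SEVEN)

def roi_by_race_type(plays, min_plays=200, bet_size=2.0):
    totals = defaultdict(lambda: [0, 0.0])   # race type -> [plays, total returned]
    for play in plays:
        entry = totals[race_type(play)]
        entry[0] += 1
        entry[1] += play["payoff"]           # $2 win payoff, 0.0 for a loss
    return {
        key: {
            "plays": count,
            "roi": returned / (count * bet_size),
            "enough_plays": count >= min_plays,   # too few plays to judge otherwise
        }
        for key, (count, returned) in totals.items()
    }

base = {"track": "SAR", "distance": "sprint", "race_class": "MClm", "age": "2yo",
        "surface": "dirt", "sex": "F", "season": "summer"}
plays = [dict(base, payoff=7.80), dict(base, payoff=0.0),
         dict(base, race_class="Alw", payoff=0.0)]

report = roi_by_race_type(plays)
for key, stats in report.items():
    print(key, stats)
```

The `enough_plays` flag is there as a reminder that an ROI on a handful of plays, good or bad, says very little until the play count catches up.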

Greed and impatience kill. The discipline to wait for them is also mandatory.

Regards,

Windoor.


~Edited by: Windoor  on:  8/11/2012  at:  9:45:37 PM~

~Edited by: Windoor  on:  8/11/2012  at:  9:47:02 PM~

Reply
Charlie James
8/11/2012
11:21:08 PM
--Quote:
"

So what changed from this year to last year that made such a drastic difference in the ROI? As mentioned above, it could be a simple track bias. It also could be the "type" of races being run. Maybe a lot more of "non winners of three" or one, or two, or conditional allowance races, etc. It can be many things. Knowing what is going to be the dominant factor today, for this "type" of race is the real challenge in my view.


"

--End Quote.




Re: the bolded part -- Truer words never spoken.

Sharp post from start to finish.

~Edited by: Charlie James  on:  8/11/2012  at:  11:21:08 PM~

Reply
Caveat
8/12/2012
7:48:06 AM


Thxs guys...the knowledge is building :)

Mike


Copyright © 2018 JCapper Software              www.JCapper.com