| 
  
	
    	    JCapper Message Board 
            
            
   | 
  
     
   | 
  
 
  
  | By | 
   Starting Factors  | 
  
 
Caveat 8/9/2012 7:36:50 AM |  Hi All...
  I am still new to this and learning. I've spent a lot of time in the old message boards over the last week or so.
  I'm playing around with constructing UDMs and would like to ask a question.
  In doing a track UDM, I believe that a good starting factor would be UPR, CPACE, AFR, CFA, etc. What would you do when none of those are near the top in ROI? I have factors like these near the top:
 
 | code: |
                                          WIN     WIN       WIN                 PLACE   PLACE
      FACTOR              PLAYS   WINS    PCT     IMPACT    ROI       PLACES    PCT     ROI
      FASTSLOWFINAL       103     32      0.3107  2.5241    1.1738    48        0.466   0.9112
      CXN                 103     31      0.301   2.4453    1.1563    51        0.4951  1.0165
      WILLTOWIN           176     19      0.108   0.8774    1.1313    44        0.25    0.9568
      LASTRACEBRISFIG     111     33      0.2973  2.4152    1.1       52        0.4685  0.9221
      COMPOUNDLATE        103     18      0.1748  1.42      1.0893    30        0.2913  0.7447
      LATESLANT           103     25      0.2427  1.9716    1.0592    42        0.4078  0.932
      BETTORSTOTEPROB     103     45      0.4369  3.5493    1.0282    65        0.6311  0.984
      CONSISTENCY         103     26      0.2524  2.0504    1.0243    46        0.4466  0.9995
      POST TIME FAVS      115     49      0.4261  3.4616    1.0209    71        0.6174  0.9752
      QSPEEDPOINTS        131     33      0.2519  2.0464    1.0038    58        0.4427  1.0011
      TURNTIME            103     26      0.2524  2.0504    0.9728    42        0.4078  0.8068
      COMPOUNDE1          103     25      0.2427  1.9716    0.967     41        0.3981  0.8573
      AVGE1               105     26      0.2476  2.0115    0.9638    43        0.4095  0.8124
   |  
 
 
  Thanks, Mike
 
  |  Charlie James 8/9/2012 11:28:03 AM | Imho, you are [still] ignoring some very good advice given out in the private section of this board. That advice went something like this: Model the big picture and don't cherry pick from among small sample results. 
  The danger in cherry picking from small sample results is that U R going down a path likely 2 get U the world's biggest back-fit. And when that back-fitted model doesn't produce the same good results going fwd U sit there wondering why. 
  [Not U personally -- but U meaning newbie players in general.]
  We were lucky enough to have the program's author present us with a "what it takes" write up -- backed up by 4 years of data. 
  I'm going to ask a serious question:
  Do U believe U have the requisite talent and insight into what it takes in this game to go down a path that ignores the advice U were given?
  ~Edited by: Charlie James  on:  8/9/2012  at:  11:25:43 AM~
  ~Edited by: Charlie James  on:  8/9/2012  at:  11:27:31 AM~
  ~Edited by: Charlie James  on:  8/9/2012  at:  11:28:03 AM~
 
  |  Charlie James 8/9/2012 12:07:25 PM | Example -- SAR dirt sprints 2011:
 
 
 | code: |
      query start:         8/9/2012 9:37:21 AM
      query end:           8/9/2012 9:37:23 AM
      elapsed time:        2 seconds

      Data Window Settings:
      Connected to: C:\JCapper\exe\JCapper2.mdb
      999 Divisor  Odds Cap: None
      Betting Instructions: Testing Purposes Only

      UDM: 0_1TrackDateExpression

      SQL:  SELECT * FROM STARTERHISTORY
            WHERE TRACK='SAR'
            AND INTSURFACE <= 3
            AND DIST < 1760
            AND [YEAR] = 2011

      Data Summary         Win     Place      Show
      Mutuel Totals    2067.00   2189.90   2122.30
      Bet             -2628.00  -2628.00  -2628.00
      Gain             -561.00   -438.10   -505.70

      Wins                 177       356       505
      Plays               1314      1314      1314
      PCT                .1347     .2709     .3843

      ROI               0.7865    0.8333    0.8076
      Avg Mut            11.68      6.15      4.20

      ****************************************************************************************
      Key Factors Rank = 1 sorted by Win ROI                  Run Date: 8/9/2012 9:37:23 AM
      ****************************************************************************************
                                          WIN     WIN       WIN                 PLACE   PLACE
      FACTOR              PLAYS   WINS    PCT     IMPACT    ROI       PLACES    PCT     ROI
      ****************************************************************************************
      COMPOUNDAP F18      182     59      0.3242  2.4068    1.1637    82        0.4505  0.9255
      FIGCONSENSUS F13    191     59      0.3089  2.2932    1.1301    92        0.4817  1.0099
      JPRCLASS F28        172     57      0.3314  2.4602    1.0718    86        0.5     0.9945
      JRATING F29         172     50      0.2907  2.1581    1.0613    75        0.436   0.9087
      USERFACTOR3 F33     172     51      0.2965  2.2011    1.048     74        0.4302  0.8547
      POWERCONSENSUS F32  180     59      0.3278  2.4335    1.0342    86        0.4778  0.8928
      UPR                 172     55      0.3198  2.3741    1.0337    86        0.5     0.9721
      COMPOUNDSP F24      182     51      0.2802  2.0801    1.0327    84        0.4615  0.9819
      USERFACTOR4 F34     172     52      0.3023  2.2442    1.0203    78        0.4535  0.8919
      TPACE F11           182     55      0.3022  2.2435    1.0091    87        0.478   1.0195
      CLASSCONSENSUS F27  193     63      0.3264  2.4231    1.0085    94        0.487   0.9554
      WEIGHTEDFIG F12     182     49      0.2692  1.9985    1.0077    74        0.4066  0.8624
      UPRMLPROB           173     61      0.3526  2.6176    1.0061    91        0.526   0.9624
      JPRMLPROB           173     57      0.3295  2.4461    0.9971    87        0.5029  0.9318
      PACEFIG F10         184     53      0.288   2.138     0.9965    83        0.4511  0.9383
      JPR                 172     53      0.3081  2.2873    0.9826    82        0.4767  0.9238...
 
   |  
 
 
 
 
 
  Now -- SAR dirt sprints 2012:
 
 
 | code: |
      query start:         8/9/2012 9:40:15 AM
      query end:           8/9/2012 9:40:16 AM
      elapsed time:        1 seconds

      Data Window Settings:
      Connected to: C:\JCapper\exe\JCapper2.mdb
      999 Divisor  Odds Cap: None
      Betting Instructions: Testing Purposes Only

      UDM: 0_1TrackDateExpression

      SQL:  SELECT * FROM STARTERHISTORY
            WHERE TRACK='SAR'
            AND INTSURFACE <= 3
            AND DIST < 1760
            AND [YEAR] = 2012

      Data Summary         Win     Place      Show
      Mutuel Totals    1039.60   1014.40    919.40
      Bet             -1088.00  -1088.00  -1088.00
      Gain              -48.40    -73.60   -168.60

      Wins                  69       140       210
      Plays                544       544       544
      PCT                .1268     .2574     .3860

      ROI               0.9555    0.9324    0.8450
      Avg Mut            15.07      7.25      4.38

      ****************************************************************************************
      Key Factors Rank = 1 sorted by Win ROI                  Run Date: 8/9/2012 9:40:16 AM
      ****************************************************************************************
                                          WIN     WIN       WIN                 PLACE   PLACE
      FACTOR              PLAYS   WINS    PCT     IMPACT    ROI       PLACES    PCT     ROI
      ****************************************************************************************
      PEDIGREE F15        70      13      0.1857  1.4641    1.2836    31        0.4429  1.3221
      CPACE F20           69      16      0.2319  1.8283    1.158     25        0.3623  0.813
      EARLYCONSENSUS F19  74      17      0.2297  1.811     1.1473    28        0.3784  0.9014
      MORNINGLINE         71      29      0.4085  3.2206    1.0718    39        0.5493  0.8915
      AFR F01             69      15      0.2174  1.714     0.9746    24        0.3478  0.7645
      USERFACTOR4 F34     69      18      0.2609  2.057     0.9565    25        0.3623  0.6891
      BETTORSTOTEPROB     70      26      0.3714  2.9281    0.9129    40        0.5714  0.92
      USERFACTOR3 F33     69      19      0.2754  2.1713    0.8877    29        0.4203  0.8739
      WOBRILL F04         70      14      0.2     1.5768    0.8814    21        0.3     0.7121
      PRIME F31           73      22      0.3014  2.3763    0.8589    38        0.5205  0.9164
      POST TIME FAVS      71      25      0.3521  2.776     0.8542    41        0.5775  0.931
      CLASSCONSENSUS F27  71      20      0.2817  2.2209    0.8275    29        0.4085  0.7549
      FORM F03            72      14      0.1944  1.5327    0.8174    23        0.3194  0.8694
      BASICFITNESS F02    105     19      0.181   1.427     0.7633    30        0.2857  0.8033
      CFA F08             69      16      0.2319  1.8283    0.7616    27        0.3913  0.7355
      RACESTRENGTH F16    70      19      0.2714  2.1397    0.7593    29        0.4143  0.7021
      JPRCLASS F28        69      19      0.2754  2.1713    0.7543    31        0.4493  0.7725
      JPR                 69      17      0.2464  1.9426    0.7442    30        0.4348  0.7812
      OPTIMIZATION F30    79      14      0.1772  1.3971    0.7437    22        0.2785  0.6468
      USERFACTOR5 F05     69      11      0.1594  1.2567    0.7196    21        0.3043  0.9645
      FORMCONSENSUS F07   72      13      0.1806  1.4239    0.7076    19        0.2639  0.6979
      UPRMLPROB           69      19      0.2754  2.1713    0.6978    30        0.4348  0.6957
      COMPOUNDSP F24      70      17      0.2429  1.915     0.6836    29        0.4143  0.7364
      FASTSLOWFINAL F09   70      16      0.2286  1.8023    0.6829    30        0.4286  0.8007
      COMPOUNDAP F18      69      17      0.2464  1.9426    0.6812    30        0.4348  0.8565...
 
   |  
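  For anyone checking the figures in the two Data Window runs above, here is a minimal Python sketch of how the summary lines appear to fit together. It assumes $2 flat win bets (inferred from Bet divided by Plays) and assumes Impact is the factor's win rate divided by the win rate of all starters in the query. Neither assumption comes from JCapper documentation, and the function names are made up for illustration. The results agree with the posted numbers to within rounding.

```python
# Totals copied from the two Data Summary blocks above (win column only).
summaries = {
    2011: {"plays": 1314, "wins": 177, "returned": 2067.00},
    2012: {"plays": 544, "wins": 69, "returned": 1039.60},
}
BET_SIZE = 2.00  # assumption: inferred from Bet / Plays (2628 / 1314)


def summary_stats(plays, wins, returned, bet_size=BET_SIZE):
    """Recompute win pct, ROI and average mutuel from raw totals."""
    wagered = plays * bet_size
    return {
        "win_pct": wins / plays,
        "roi": returned / wagered,
        "avg_mutuel": returned / wins,
    }


def impact_value(factor_wins, factor_plays, total_wins, total_plays):
    """Assumed definition: factor win rate over the win rate of all starters."""
    return (factor_wins / factor_plays) / (total_wins / total_plays)


for year, totals in summaries.items():
    stats = summary_stats(**totals)
    print(year, {name: round(value, 4) for name, value in stats.items()})

# COMPOUNDAP rank=1 in 2011: 59 wins from 182 plays
print(round(impact_value(59, 182, 177, 1314), 4))
```

  An ROI below 1.00 in the Data Summary line means every starter in the query, bet blindly, lost money. That is the baseline each factor row is being compared against.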
 
 
 
  In 2011 CompAP rank=1 was the top factor [32% winners 1.16 roi.] But so far in 2012 CompAP rank=1 is at only 25% winners [0.2464] and 0.68 roi. 
  Q. Why?
  A. Speaking strictly for myself, I haven't the 1st clue. The best insight I can come up with after looking at this and hundreds of similar [small] data samples is: 
  Because that's the way horse racing data behaves.
  Q. Caveat, can you [or anyone else?] provide me with insight -- backed up by reasoning and data samples -- that flies in the face of this? -- Can U or anyone else tell me WHEN to expect cherry picked results to perform well going fwd and why? 
  Because if I could only know ahead of time when to cherry pick and what to cherry pick [and why] -- this game would be ridiculously EZ to beat. 
  Until such time as that happens I will continue on the path that has produced  reasonably good results  -- at least 4 me: I will stick 2 modeling the big picture.
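  One way to put a number on how little a small sample pins down is a confidence interval on the win rate. The sketch below uses the Wilson score interval, a standard statistics formula that is not part of this thread or of JCapper, applied to the CompAP rank=1 rows from the two runs above. It covers win percentage only; ROI depends on payoffs as well and is noisier still.

```python
import math


def wilson_interval(wins, plays, z=1.96):
    """Wilson score interval for a win rate (z=1.96 is roughly 95%)."""
    p = wins / plays
    denom = 1 + z * z / plays
    center = (p + z * z / (2 * plays)) / denom
    half = z * math.sqrt(p * (1 - p) / plays + z * z / (4 * plays * plays)) / denom
    return center - half, center + half


# COMPOUNDAP rank=1, SAR dirt sprints, from the two Data Window runs above
low_2011, high_2011 = wilson_interval(59, 182)
low_2012, high_2012 = wilson_interval(17, 69)

print(f"2011: {low_2011:.3f} to {high_2011:.3f}")
print(f"2012: {low_2012:.3f} to {high_2012:.3f}")
print("ranges overlap:", low_2011 <= high_2012 and low_2012 <= high_2011)
```

  When the two ranges overlap, the samples alone cannot separate a real change in the factor from ordinary noise, which is consistent with the point about small samples made above.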
 
 
  ~Edited by: Charlie James  on:  8/9/2012  at:  12:05:21 PM~
  ~Edited by: Charlie James  on:  8/9/2012  at:  12:07:25 PM~
 
  |  jeff 8/9/2012 1:58:04 PM | Searching through past posts on the same topic, I came up with the following thread: http://www.jcapper.com/messageboard/TopicReader.asp?topic=1105&forum=JCapper%20101
  I thought Steve's reply was relevant and the bolded text from his quote was put there by me to emphasize what he said:
 
  --quote:"I agree with the above. All too often, factors which do a great job of filtering out losers and boosting your ROI in your development sample, end up filtering out winners in your fresh data. You always want to keep independent data available to test anything you're doing." --end quote
 
  In my opinion, after creating a UDM, any UDM, be it based on any concept that involves R&D using large sample or small: You want to validate performance going forward by confronting the UDM with races from outside the sample used when developing the UDM. And hold off betting real money on the UDM until you see clear evidence that the concept encapsulated in the UDM is "validating" or performing well going forward in time.
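  As a rough illustration of confronting a development sample with fresh races, the Python sketch below treats the SAR 2011 dirt sprint run posted earlier in this thread as the development sample and the partial 2012 meet as the fresh sample. It only looks at rank=1 factors that showed a win ROI above 1.00 in 2011 and also appear in the 2012 excerpt (both listings were truncated in the posts). The function name and the 1.00 break-even threshold are assumptions made for the example, not a JCapper feature.

```python
# (plays, wins, win ROI) for rank=1 horses, copied from the two
# SAR dirt sprint Data Window runs posted earlier in this thread.
development_2011 = {
    "COMPOUNDAP": (182, 59, 1.1637),
    "JPRCLASS": (172, 57, 1.0718),
    "USERFACTOR3": (172, 51, 1.0480),
    "COMPOUNDSP": (182, 51, 1.0327),
    "USERFACTOR4": (172, 52, 1.0203),
    "CLASSCONSENSUS": (193, 63, 1.0085),
    "UPRMLPROB": (173, 61, 1.0061),
}
fresh_2012 = {
    "COMPOUNDAP": (69, 17, 0.6812),
    "JPRCLASS": (69, 19, 0.7543),
    "USERFACTOR3": (69, 19, 0.8877),
    "COMPOUNDSP": (70, 17, 0.6836),
    "USERFACTOR4": (69, 18, 0.9565),
    "CLASSCONSENSUS": (71, 20, 0.8275),
    "UPRMLPROB": (69, 19, 0.6978),
}
BREAK_EVEN = 1.0  # hypothetical threshold for "performing well"


def validation_report(dev, fresh, break_even=BREAK_EVEN):
    """List factors profitable in development with their fresh-sample ROI."""
    rows = []
    for factor, (_, _, dev_roi) in dev.items():
        if dev_roi > break_even and factor in fresh:
            fresh_roi = fresh[factor][2]
            rows.append((factor, dev_roi, fresh_roi, fresh_roi > break_even))
    return rows


report = validation_report(development_2011, fresh_2012)
for factor, dev_roi, fresh_roi, held_up in report:
    print(f"{factor:<16} dev {dev_roi:.4f}  fresh {fresh_roi:.4f}  held up: {held_up}")
```

  The 2012 sample is a partial meet of roughly 70 plays per factor, so a poor fresh result here is not proof that a factor is worthless. It is only a reason to hold off betting real money until more fresh races are in.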
 
 
 
  I also found the following thread on Track Profile Theory: http://www.jcapper.com/MessageBoard/TopicReader.asp?topic=135&forum=General
  Speaking strictly for myself (and admittedly it is an acquired skill) I have had success in the past when I am able to relate Data Window results to a physical cause. 
  For example, in the SAR 2012 dirt sprint results that Chuck posted above, I see clear evidence of an early speed bias. I say that because I know at a glance that CPace and EarlyConsensus performance in those results is above historical norms. 
  But what cements it for me is watching races run there so far this meet. The leaders aren't getting tired. Also, a look at the overhead camera "snapshot" talked about in the Track Profile Theory thread indicates (at least to me) that the numbers in the Data Window aren't the result of some random cosmic accident - that they are in fact being produced because there is an actual speed bias.
  Q. What causes that bias? Is it the weather? Humidity? Track maintenance? Or are other unexplained phenomena at work?
  A. I haven't the first clue. But I do know from looking at video of the horses and from looking at overhead snapshots of where the horses are when the winner breaks the plane of the finish line and from numbers in the Data Window that a speed bias is there.
  Q. Will that speed bias continue?
  A. I haven't the first clue. If you believe in track profile theory - go for it. 
  If it suddenly reverses tomorrow: don't be surprised. 
  If it holds up through closing day - but weather, humidity, track maintenance next year cause the same surface to favor closers: don't be surprised by that either.
 
  -jp
  .
 
  |  Charlie James 8/9/2012 4:52:01 PM | Jeff, Love the overhead snapshot concept. Brilliant if U ask me.
  Fwiw I've never personally been able to lock into a speed bias early enough to take advantage. By the time I see it so has everybody else -- and the boxcar prices have already been paid out. By the time I pick up on it, oh the bias is still there -- but the public k_n_o_w_s and the result is a chalk parade [which causes the trainers to complain to management who in turn tells the track super to harrow deeper.]
  For some reason I have no problem using speed and separation as the universal bias. Imho still the single best trip to the winners circle [even after the advent of polyshite.] I also have no problem working to educate myself -- to come up with a short list of trainers with the proven talent to prep their babies in such a way to take full advantage of the so called universal bias. -- Made easier I guess by years of going to KEE each fall to follow who bought what and for whom -- and then watch it all unfold the following spring as the babies grow up and work their way through the condition book.
  A good buddy of mine used to go to Fla and then later in life Ariz every March to watch the kids in the farm system vie for spots in the big leagues. Point is -- follow this or any game closely enough and going beyond the numbers [I like that phrase] is suddenly within reach.
  Different ways to skin a cat I guess.
  ~Edited by: Charlie James  on:  8/9/2012  at:  4:50:18 PM~
  ~Edited by: Charlie James  on:  8/9/2012  at:  4:52:01 PM~
 
  |  jeff 8/10/2012 12:06:16 AM | Mike, I wanted to make a few specific comments about your post.
  In my way of doing things, there are two different types of UDMs and each has a very different yet specific purpose:
  1. The Business UDM - When Chuck posted the words "model the big picture" (and bluntly I might add) I'm about 99% sure he was talking about Business UDMs. 
  In JCapper terminology, a Business UDM is a UDM designed to point out horses that are very close to being automatic bets - those that have lots of hidden positive attributes in their past performance records. In the "what it takes" write up, I hope I was able to make the point that the phrase positive hidden attributes is in no way limited to "handicapping" in the traditional sense. (In fact, the topic of "handicapping" was purposely avoided.) Instead of basing the modeling process around attributes tied to the horse - the process was turned on its head - and the "handicapping" was instead focused on (less than perfect) public betting behavior.
  When a business UDM flags a horse on one of my reports - it (rightly) deserves my intense focus. I say that because years of Data Window R&D and wager history analysis very clearly tells me those are the UDMs driving my profits. 
  That last sentence describes what is meant by the term "Business UDM." 
  By the way, I fully agree with Chuck. When creating a Business UDM, forget the small sample and the track specific. Model the big picture instead. 
  2. The Layering UDM - In JCapper terminology, the Layering UDM can be anything (small sample or large) that adds an additional "layer" of knowledge to the player's understanding of something related to either the race or individual horses in the race. A layering UDM can literally be based on anything.
  Last Thurs I spent a day at DMR and ran into John Doyle. For those of you who may not be aware, John was the overall winner of the NHC tournament in Jan 2011. (No, John is not a JCapper guy.) He mostly uses the DRF and a ball point pen. 
  IMHO, if horse racing were chess John would be a grand master while the rest of us (myself included) would be local tournament players at best. It's both scary and amazing how quickly he subconsciously homes in on patterns. 
  Anyway, we came to a MClm race with a Sadler FTS in it. John instantly knows that Sadler is 3 for 8 at DMR with FTS... win pct .375. Quick odds conversion = approx 8/5. 
  Conversely, John says that Sadler is 1 for 11 with FTS in SPLWT at DMR and wouldn't touch a Sadler FTS in a SPLWT at DMR with my money... well, maybe with MY money! but definitely not his money.
  Me? I sense small sample syndrome at work. Not liking anything else in the race I pass. 
  John loads up on what he sees as very generous odds (7/2) and watches his Sadler FTS stalk the pace while in hand, pull even at about the 1/8th pole - and then win going away late. 
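  The quick odds conversion in this story can be written out as a short Python sketch. It assumes the 3 for 8 record is the horse's true win rate, which is exactly the small sample question raised above, and the function names are invented for the example.

```python
from fractions import Fraction


def fair_odds(wins, starts):
    """Break-even odds-to-1 implied by a win record: (1 - p) / p."""
    p = Fraction(wins, starts)
    return (1 - p) / p


def roi_at_odds(wins, starts, odds_to_one):
    """Expected return per $1 bet if the win rate held at the posted odds."""
    p = Fraction(wins, starts)
    return p * (odds_to_one + 1)


fair = fair_odds(3, 8)      # 5/3, a little above 8/5
offered = Fraction(7, 2)    # the 7/2 mentioned in the post
print(f"fair odds: {fair} (about {float(fair):.2f}-1)")
print(f"expected ROI at {offered}: {float(roi_at_odds(3, 8, offered)):.4f}")
```

  Any offered price above the fair odds is an overlay on paper. Whether eight starts are enough to trust the .375 is a judgment call the arithmetic cannot make.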
  Later that night over beers and dinner at a nice place, John and I are talking "shop." It occurs to me that in this case, even though he's not a JCapper guy - John had a "Layering" UDM in his head based on Sadler FTS's in MClm races at DMR. 
  It also occurs to me that John won the NHC and I didn't. That fact is not lost on me. 
  (It's not like I didn't have a good day myself. I did. But it's just like Chuck says... many ways to skin a cat.)
 
 
 
  Below are a couple of links to some older posts in the private section of the board where I laid out a few of my thoughts on Layering UDMs and how best to use them.
  JCapper Under The Hood - Jan, 2011: http://www.jcapper.com/MessageBoard/TopicReader.asp?topic=903&forum=Private
  Stunned by JCapper yet again - June, 2012: http://www.jcapper.com/MessageBoard/TopicReader.asp?topic=1277&forum=Private
 
 
 
  Wrapping this up... if that's even possible... the Layering UDM absolutely CAN be based on the small sample and the track specific. Some of mine are. 
  My best successes always seem to come about where Business UDM and Layering UDM meet.
 
  -jp
  .
 
 
  ~Edited by: jeff  on:  8/10/2012  at:  12:06:16 AM~
 
 |  Caveat 8/10/2012 11:30:03 AM | Thxs Charlie!! While I was at work yesterday, I took a peek to see if I got a response. I glanced over it quickly, and after I saw the charts that you put up, it hit me. At first, when you mentioned back-fitting and the big picture, I wasn't sure what you were talking about and was shy to ask. Now, with those charts, I can see that if I had built a UDM based on last year's numbers, it would have been disastrous come 2012. 2011 showed average-paced horses having the advantage; this year it would be early horses having the advantage.
  You asked what could be cherry picked. I'm guessing maybe trainers, connections, post bias, and others, if the data shows up again on those factors. Databases are completely new to me, so please be patient :)
  Moving forward, I now have data on what happened way back and data on what's happening now, and I would put an emphasis toward early.
  Jeff, thxs for taking the time to post a reply. I will get to your stuff soon.
  Mike
  Things have to sink in little by little
  ~Edited by: Caveat  on:  8/10/2012  at:  9:25:08 AM~
  Stupid me... I was thinking that there was only one page in JCP 101, because I didn't find page numbers at the bottom. There's a back button!! WOW... Tons of reading :)
  ~Edited by: Caveat  on:  8/10/2012  at:  11:06:26 AM~
  ~Edited by: Caveat  on:  8/10/2012  at:  11:06:53 AM~
  ~Edited by: Caveat  on:  8/10/2012  at:  11:30:03 AM~
 
 |  Windoor 8/11/2012 9:47:02 PM | I will add my two cents on why one year's results are (or can be) so much different from the next, and what you might be able to do about it.
  Need I say it? All in my most humble opinion.
  I believe there are many different kinds of races other than simply Maiden, Claiming, Allowance, Stakes, Handicap, and variations of same. Having said that, I believe they can all be broken down by Track, Distance, Class, Age, Surface (including today's variant), Sex, and time of year.
  I call this "The Seven," and it is how I separate the different kinds of races. When you consider all of the subcategories for each and the possible combinations of them, there are a great many to take into consideration. You can now see just how complex our problem can be when deciding what factors to use in our UDMs.
  Some factors can transcend many categories. Others, like some performance factors (speed and pace), can change drastically with the track conditions, and this can happen on a daily basis at some tracks. I tend to stay away from them (performance factors) even though they can indeed show a healthy win percent. It's the average odds I object to.
  So what changed from this year to last year that made such a drastic difference in the ROI? As mentioned above, it could be a simple track bias. It also could be the "type" of races being run. Maybe a lot more of "non winners of three" or one, or two, or conditional allowance races, etc. It can be many things. Knowing what is going to be the dominant factor today, for this "type" of race is the real challenge in my view.
  I now have Seven Key factors (only one is based on speed), Seven Primary factors, and Seven Secondary factors. They all have value, but some really shine when a specific "type" of race comes along. 
  I used to have a signature statement that said, "The Numbers Have Hinges." This is a reference to factors whose "Value" has changed due to the type of race being run. A top ranked factor that used to give us a lot of winners may now be nearly useless due to changes in the track surface or "types" of races being run. 
  I would recommend that anyone who is still struggling to maintain a profit break down their plays by the Seven. Pick one from each category, and build a UDM for it. Test it over a three-year period (or more), or at least a few hundred consecutive races. Then support it with a small bank to see if it grows. It may very well work at other tracks, class levels, distances, or any of the seven. Only a large database that can show enough consecutive plays can tell you if it has value or not. 
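  A breakdown by the Seven could be sketched in Python along these lines. This is only one possible reading of the idea: the field values, the $2 bet size, the 200-play minimum (standing in for "a few hundred"), and the sample payoffs are all invented for illustration and are not taken from any real database.

```python
from collections import defaultdict
from typing import NamedTuple


class RaceType(NamedTuple):
    """The seven categories named above; example values are hypothetical."""
    track: str
    distance: str
    race_class: str
    age: str
    surface: str
    sex: str
    time_of_year: str


def breakdown(plays, min_plays=200, bet_size=2.0):
    """Group (RaceType, payoff) plays and report flat-bet ROI per race type."""
    totals = defaultdict(lambda: [0, 0.0])  # race type -> [plays, returned]
    for race_type, payoff in plays:
        totals[race_type][0] += 1
        totals[race_type][1] += payoff
    result = {}
    for race_type, (count, returned) in totals.items():
        result[race_type] = {
            "plays": count,
            "roi": returned / (count * bet_size),
            "enough_plays": count >= min_plays,
        }
    return result


sar_sprint = RaceType("SAR", "sprint", "MClm", "2yo", "dirt", "fillies", "summer")
# Invented payoffs for illustration only (0.0 means a losing ticket).
sample_plays = [(sar_sprint, 0.0), (sar_sprint, 7.40), (sar_sprint, 0.0)]
report = breakdown(sample_plays)
print(report[sar_sprint])
```

  The `enough_plays` flag is there so that a race type with a handful of plays is not mistaken for a tested one, however good its ROI looks.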
  Even then, there is no guarantee that it will work going forward. I start each with a very small bank, and only increase the wager when the bank has grown enough to support it. 
  Greed and impatience kill. The discipline to wait for the right plays is also mandatory.
  Regards,
  Windoor.
 
  ~Edited by: Windoor  on:  8/11/2012  at:  9:45:37 PM~
  ~Edited by: Windoor  on:  8/11/2012  at:  9:47:02 PM~
 
  |  Charlie James 8/11/2012 11:21:08 PM | | --Quote: |  "
 
 So what changed from this year to last year that made such a drastic difference in the ROI? As mentioned above, it could be a simple track bias. It also could be the "type" of races being run. Maybe a lot more of "non winners of three" or one, or two, or conditional allowance races, etc. It can be many things. Knowing what is going to be the dominant factor today, for this "type" of race is the real challenge in my view.
  
  "
  --End Quote. |  
 
 
 
  Re: the bolded part -- Truer words never spoken. 
  Sharp post from start to finish.
  ~Edited by: Charlie James  on:  8/11/2012  at:  11:21:08 PM~
 
  |  Caveat 8/12/2012 7:48:06 AM | 
  Thxs guys...the knowledge is building :)
  Mike
 
  |  
     
 
  
   
   |