Batting Stats 6: Follow-up on wOBA

I wasn’t totally comfortable with the testing results for wOBA.  I didn’t change the calculation, but I did change my testing methodology.  The first time around was totally unscientific:  I typed random player_id values into a query, looked for stats that felt ‘good’ (whatever that means), grabbed the players’ names, and searched for their stats in the game.

I revisited my method several times.  Since my biggest misses in the last test came from small sample sizes, and since I don’t really care about stats with small sample sizes, I limited my sample to player-years with more than 200 ABs.  Perhaps redundantly, I removed pitchers from the sample as well.  Thinking that there would be an easy way to grab the stats from OOTP, I limited the sample to the last two game years. (There isn’t, really.)  Finally, rather than cherry-picking records that passed some indefinable sniff test, I assigned a random number to each record, sorted my results by that random number, and limited the result set to 30.  Then I went into the game and looked up the stats for those player-years.

Here’s the query that did that:

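-- Draw 30 random qualifying player-years to check against the game
SELECT RAND() AS sorted                            -- random sort key per row
  , p.player_id
  , CONCAT(first_name, ' ', last_name) AS player
  , b.year
  , l.abbr
  , b.ab
  , b.woba
FROM players p
  INNER JOIN CalcBatting b ON p.player_id = b.player_id
  INNER JOIN leagues l ON b.league_id = l.league_id
WHERE p.position <> 1        -- position 1 = pitcher
  AND b.year >= 2014         -- last two game years
  AND b.ab > 200             -- skip small samples
ORDER BY sorted
LIMIT 30;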
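
For anyone who wants to repeat the spot check outside the database, the same filter-then-sample idea can be sketched in Python.  This is only an illustration:  the rows are made up, the function name `sample_player_years` and the `seed` parameter are my own inventions, and the field names simply mirror the columns in the query above.

```python
import random

def sample_player_years(rows, n=30, min_year=2014, min_ab=200, seed=None):
    """Keep qualifying non-pitcher player-years, then draw up to n at random."""
    eligible = [
        r for r in rows
        if r["position"] != 1       # position 1 = pitcher, as in the query
        and r["year"] >= min_year   # last two game years
        and r["ab"] > min_ab        # skip small samples
    ]
    rng = random.Random(seed)       # a fixed seed makes the draw repeatable
    return rng.sample(eligible, min(n, len(eligible)))

# Made-up rows, only to show the call; real rows would come from the database.
rows = [
    {
        "player_id": i,
        "position": 1 if i % 10 == 0 else 2 + i % 8,
        "year": 2012 + i % 4,
        "ab": 100 + (i * 37) % 500,
    }
    for i in range(500)
]

picked = sample_player_years(rows, seed=1)
print(len(picked), "player-years selected")
```

Passing a seed is the one thing this adds over sorting by an unseeded rand():  the same 30 player-years come back on every run, which helps if the comparison against the game has to be done over several sittings.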

I am much more comfortable with this result set:

23 of the 30 (77%) were within range, and 3 of the 5 medium results were within 3 points of being good.  The two bad ones were really bad, though.  I will take a look at those particular seasons to see if I can find any stats that are outside the normal range.  My thought is that somehow I am weighting a less-common stat (such as hit by pitch) incorrectly.  For a normal season, if HBP is weighted wrong but only occurs 5 times in 600 plate appearances, no big deal.  But if a guy got plunked 50 times or something, it could really throw off the calculation.  That’s my working theory going in, at least.  If I find something weird, I will report back.  Otherwise, I’m going to move on.
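
To put rough numbers on that theory, here is a quick back-of-the-envelope sketch.  It assumes the HBP weight is off by 0.10 (a made-up figure, not something I have measured) and treats the wOBA denominator as a flat 600, which is only an approximation of the real one.  The function name `woba_error_points` is just for illustration.

```python
def woba_error_points(weight_error, hbp, denominator):
    """Shift in wOBA, in points (thousandths), caused by a mis-set HBP weight.

    Each HBP adds its weight to the wOBA numerator, so a weight that is off
    by weight_error moves the numerator by weight_error * hbp.
    """
    return 1000 * weight_error * hbp / denominator

# Hypothetical inputs: weight off by 0.10, denominator of roughly 600.
typical = woba_error_points(0.10, hbp=5, denominator=600)
extreme = woba_error_points(0.10, hbp=50, denominator=600)

print(f"5 HBP:  {typical:.1f} points")
print(f"50 HBP: {extreme:.1f} points")
```

Under those assumptions the error is under one point for the 5-HBP season and a bit over eight points for the 50-HBP season.  The error scales linearly with the count, so a bad weight on a rare event only shows up in the outlier seasons.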
