Batting Stats 7: wOBA Corrected

I think I found one of the culprits: Intentional Walks.  Our two guys, Sink and McDonough, have IBB’s in that season well over the average.  I queried the database to find their IBB’s as well as league average for those seasons:

SELECT b.player_id
  , concat(p.first_name, " ", p.last_name) as player 
  , l.abbr
  , b.year
  , b.league_id as league 
  , b.ibb
  , x.lg_avg_ibb
  , b.ibb-x.lg_avg_ibb as diff
FROM (  	
	SELECT b.year
	 , b.league_id
	 , round(avg(b.ibb),0) as lg_avg_ibb
	FROM CalcBatting b
	INNER JOIN players p ON b.player_id-p.player_id
	WHERE b.ab>200 AND p.position<>1 
	GROUP BY b.year, b.league_id
	) as x
  INNER JOIN CalcBatting b ON x.year=b.year AND x.league_id=b.league_id
  INNER JOIN players p ON b.player_id=p.player_id
  INNER JOIN leagues l ON b.league_id=l.league_id
 WHERE b.player_id IN (2574, 472) AND b.year=2015;

You’ll see that both were well above average:

Sink is about 1 standard deviation above average while McDonough is off the charts at almost 6.

I took a look at the 10 highest IBB variances to see if we can say with any certainty that we’re handling IBB’s incorrectly:

Looking at this, I think it’s pretty clear that if nothing else, IBB is at least contributing to the differences in wOBA by weighting it too highly.

It turns out that it’s not necessarily the weighting, but rather for formula for wOBA itself.  I removed IBB from the formula and recalculated wOBA for these 10 player-seasons:

Every single one of them improved.  Not yet perfect, but now that I have the outliers at least within the medium range, I feel comfortable moving on for real.  So, here’s the revised script for CalcBatting with the corrected formula for wOBA:

#Calculated batting stats for OOTP
    DROP TABLE IF EXISTS CalcBatting;
    CREATE TABLE IF NOT EXISTS CalcBatting AS

    SELECT b.year
    , b.league_id
    , b.player_id
    , b.stint #We can eventually move this down the list
    , b.split_id #We can eventually remove
    , b.team_id #We can eventually move this down the list
    , l.abbr as Lg
    , t.abbr as Team
    , b.g
    , b.ab
    , @PA := b.ab+b.bb+b.sh+b.sf+b.hp AS PA
    , b.r 
    , b.h
    , b.d
    , b.t
    , b.hr
    , b.rbi
    , b.sb
    , b.cs
    , b.bb
    , b.k
    , b.ibb
    , b.hp
    , b.sh
    , b.sf
    , b.gdp
    , b.ci
    , @BA := round(b.h/b.ab,3) AS ba
    , round(b.k/@PA,3) as krate
    , round((b.bb)/@PA,3) as bbrate
    , @OBP := round((b.h + b.bb + b.hp)/(@PA-b.sh-b.ci),3) AS obp
    , round(100*(@OBP/r.woba),0) as OBPplus
    , @SLG := round((b.h+b.d+2*b.t+3*b.hr)/b.ab,3) as slg
    , round(@OBP+@SLG,3) as ops
    , round(@SLG-@BA,3) as iso
    , round((b.h-b.hr)/(b.ab-b.k-b.hr+b.sf),3) as babip
    , round((r.wobaBB*(b.bb-b.ibb) + r.wobaHB*b.hp + r.woba1B*(b.h-b.d-b.t-b.hr) +
       r.woba2B*b.d + r.woba3B*b.t + r.wobaHR*b.hr)
       /(b.ab+b.bb-b.ibb+b.sf+b.hp),3) as woba
    /* NOT yet modified for OOTP and MySQL
    , round((([woba]-r.lgwoba)/r.wobascale)*[PA],1) as wRAA
    , round(((([woba]-r.lgwoba)/r.wobascale)+(l.totr/l.totpa))*[PA],0) as wRC
    , ((([wRAA]/[PA] + l.RperPA) + (l.RperPA - t.bpf*l.RperPA))/(lb.wRC/lb.PA))*100 AS [wRC+]
    */
    FROM 
      players_career_batting_stats b 
      INNER JOIN leagues l ON b.league_id=l.league_id 
      INNER JOIN teams t ON b.team_id=t.team_id
      INNER JOIN tblRunValues2 r ON b.year=r.year AND b.league_id=r.league_id
    WHERE b.ab<>0 AND b.split_id=1
    ORDER BY b.player_id, b.year

Leave a Reply

Your email address will not be published. Required fields are marked *