Pitching Stats 2: FIP – The False Start

FIP, or Fielding Independent Pitching, is based on the idea that pitchers are only in control of the “3 true outcomes” of a plate appearance: Strikeouts, Home Runs, and Free Passes (HBP and BB’s).  Everything else relies on defense which is largely beyond the pitcher’s control.  FIP is scaled, through the use of a constant, to a league’s ERA.

The formula to derive FIP is:

FIP = ((13*HR)+(3*(BB+HBP))-(2*K))/IP + constant

and the formula for the deriving the constant is similar:

FIP Constant = lgERA – (((13*lgHR)+(3*(lgBB+lgHBP))-(2*lgK))/lgIP)

We’re going to make a quick table to calculate the FIPConstant for each league year that we’ll reference when calculating FIP for each player stint.  Happily, the game gives us league ERA in the league_history_pitching_stats table, so we’ve been spared a step.  Because I am, apparently, not very good with the order of operations and parentheses, I have spent the last hour pulling my hair out trying to get a FIP Constant that looks reasonable.  In an attempt to save some of my last remaining hairs, I made a very inelegant table.  Behold my genius:

DROP TABLE IF EXISTS FIPConstant;
CREATE TABLE IF NOT EXISTS FIPConstant AS

SELECT
    lhps_id
    , year
    , league_id
    , level_id
    , hra
    , bb
    , hp
    , k
    , @HRAdj := 13*hra AS Adjusted_HR
    , @BBAdj := 3*bb AS Adjusted_BB
    , @HPAdj := 3*hp AS Adjusted_HP
    , @KAdj  := 2*k AS Adjusted_K
    , @InnPitch := ((ip*3)+ipf)/3 AS InnPitch
    , era
    , era - ((@HRAdj+@BBAdj+@HPAdj-@KAdj)/@InnPitch) AS FIPConstant
FROM league_history_pitching_stats;

On the CalcPitching table, we’re adding FIP and disregarding left/right splits for the moment.  Our table script now looks like this:

DROP TABLE IF EXISTS CalcPitching;
CREATE TABLE IF NOT EXISTS CalcPitching AS

SELECT
    i.player_id
    , i.year
    , i.stint
    , i.team_id
    , i.league_id
    , split_id
    , i.ip
    , i.ab
    , i.tb
    , i.ha
    , i.k
    , i.bf
    , i.rs
    , i.bb
    , i.r
    , i.er
    , i.gb
    , i.fb
    , i.pi
    , i.ipf
    , i.g
    , i.gs
    , i.w
    , i.l
    , i.s
    , i.sa
    , i.da
    , i.sh
    , i.sf
    , i.ta
    , i.hra
    , i.bk
    , i.ci
    , i.iw
    , i.wp
    , i.hp
    , i.gf
    , i.dp
    , i.qs
    , i.svo
    , i.bs
    , i.ra
    , i.cg
    , i.sho
    , i.sb
    , i.cs
    , i.hld
    , i.ir
    , i.irs
    , i.wpa
    , i.li
    , i.outs
    , i.war
    , @InnPitch := ((3*ip)+ipf)/3 AS InnPitch
    , round((9*i.k)/@InnPitch,1) AS 'k9'
    , round((9*i.bb)/@InnPitch,1) AS 'bb9'
    , round((9*i.hra)/@InnPitch,1) AS 'HR9'
    , round((i.bb+i.ha)/@InnPitch,2) AS WHIP
    , round(i.k/i.bb,2) AS 'K/BB'
    , i.gb/i.fb AS 'gb/fb'
    , round((i.ha-i.hra)/(i.ab-i.k-i.hra-i.sh+i.sf),3) AS BABIP
    , round(i.er/@InnPitch,2) AS ERA
    , round(((13*i.hra)+(3*(i.bb+i.hp))-(2*i.k))/@InnPitch+f.FIPConstant,2) AS fip
    
    
FROM players_career_pitching_stats AS i
    INNER JOIN FIPConstant AS f ON i.year=f.year AND i.league_id=f.league_id
WHERE i.split_id=1;

So, how did it go?  Not great.  I took a random sample from my database and compared it to the game’s generated stats.  I wanted my FIP calculations to be within .05 of the game’s.

While most were in the medium range, it seems that there’s something different in the way the game calculates FIP.  Our numbers are close enough that it can’t be a major difference.  I’m going to follow a hunch and guess that it’s Hit By Pitch.  I will remove HBP as a factor in both the FIPConstant and FIP calculations and see what that does to our results.

I got about a third of the way through the revised calcs when I noticed a problem with the FIPConstant table.  This table pulls data from the league_history_pitching_stats table.  The problem is there.  You see, as I mentioned in the table setup posts and then promptly forgot about, there are a couple of columns in the league_history tables that attempt to distinguish between subleagues but do not give any indication of which is which. (They are the team_id and game_id columns.)  What this does create two records for each league (one for each subleague) with different totals but no way to identify the subleague being referenced.  This is no good.

My new hunch is that HBP is not the issue.  The formula is probably fine, I will just have to change the FIPConstant table to sum data from players_career_pitching_stats.  I’m going to publish this post as a testament to my naiveté and get to work on the revised table.

Tables 11: Final Thoughts on OOTP Tables

Well, the easy part is over. There are a couple of unresolved issues that I am looking forward to settling. The main one of these being figuring what and how to use the ‘team_id’ and ‘game_id’ fields in the League History Tables. Figuring that out will allow me to separate some of the adjusted stats to the sub_league level. Without resolving this, I would have to join the players’ team_id to a league_id-sub_league_id pair on the team_relations table and then group by that pair. And since sub_league_id’s are not unique it would be a little trickier. I can’t see it working in my head at the moment, though I am sure an hour or two at the whiteboard could see me through. Still, not for now.

There will be other tables added as my little project grows wings, but these are all I need right now to start figuring out how to calculate stats.

Tables 3: The League History Tables.

Edited based on some really useful tips from the OOTP Community’s resident MySQL whisperer

Where in Lahman, you have to build your own views for league statistics by season, OOTP does that for you.  This is nice.  I have a couple issues with the some of the columns in these tables and I will explain them as we build them.  Also, here’s where I started adding AUTO_INCREMENT ID fields to these tables.  I tried playing around with massive, composite PK’s and it wasn’t good.

Creating the league_history table

So, I thought this table was going to be useful when I configured the MySQL dump in-game.  Turns out, it’s not all that useful for my purposes.  It just identifies the award winners for each year in each league and sub_league.  More out of laziness than anything else, I am keeping it in the schema:

CREATE TABLE `league_history` (
  `league_id` int(11) NOT NULL,
  `sub_league_id` int(11) NOT NULL,
  `year` int(11) NOT NULL,
  `best_hitter_id` int(11) DEFAULT NULL,
  `best_pitcher_id` int(11) DEFAULT NULL,
  `best_rookie_id` int(11) DEFAULT NULL,
  `best_manager_id` int(11) DEFAULT NULL,
  `best_fielder_id0` int(11) DEFAULT NULL,
  `best_fielder_id1` int(11) DEFAULT NULL,
  `best_fielder_id2` int(11) DEFAULT NULL,
  `best_fielder_id3` int(11) DEFAULT NULL,
  `best_fielder_id4` int(11) DEFAULT NULL,
  `best_fielder_id5` int(11) DEFAULT NULL,
  `best_fielder_id6` int(11) DEFAULT NULL,
  `best_fielder_id7` int(11) DEFAULT NULL,
  `best_fielder_id8` int(11) DEFAULT NULL,
  `best_fielder_id9` int(11) DEFAULT NULL,
  PRIMARY KEY (`league_id`,`sub_league_id`,`year`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

 

Create league_history_batting_stats table

There are a couple puzzles in this and the next two tables.  The columns ‘game_id’ and ‘team_id’ are hard to figure out.  In fact, I have not figured them out.  Their purpose, I’m sure, is to identify the sub_league.  Unfortunately, it doesn’t use a column called ‘sub_league’.  I haven’t been able to find any table that links either of these columns to sub_league.  So, there’s not much we can do with them.  Moreover, we’ll still have to create a view later on that sums the totals for each ‘team_id’ and ‘game_id’ pair.  I even posted about this on the OOTP Boards, but no one has chimed in yet.  Wishing doesn’t make it true.  Those columns most likely do not represent usable data.

‘split_id’ refers to stats accumulated against all pitchers, just righties, just lefties.  This is important at the player level.  At the league level, though, all split_id=0.

Also, this table conveniently tracks singles, ‘s’, as a column.  Would be nice if it did the same at the player level to save us the calculation in some statistics, but oh well.

Note that I added an AUTO_INCREMENT ID as PK and indexed ‘year’, ‘league_id’, and ‘split_id’.

CREATE TABLE `league_history_batting_stats` (
  `lhbs_id` int(11) NOT NULL AUTO_INCREMENT,    
  `year` smallint(6) NOT NULL,
  `team_id` int(11) NOT NULL,
  `game_id` int(11) DEFAULT NULL,
  `league_id` int(11) NOT NULL,
  `level_id` smallint(6) DEFAULT NULL,
  `split_id` smallint(6) DEFAULT NULL,
  `pa` int(11) DEFAULT NULL,
  `ab` int(11) DEFAULT NULL,
  `h` int(11) DEFAULT NULL,
  `k` int(11) DEFAULT NULL,
  `tb` int(11) DEFAULT NULL,
  `s` int(11) DEFAULT NULL,
  `d` int(11) DEFAULT NULL,
  `t` int(11) DEFAULT NULL,
  `hr` int(11) DEFAULT NULL,
  `sb` int(11) DEFAULT NULL,
  `cs` int(11) DEFAULT NULL,
  `rbi` int(11) DEFAULT NULL,
  `r` int(11) DEFAULT NULL,
  `bb` int(11) DEFAULT NULL,
  `ibb` int(11) DEFAULT NULL,
  `hp` int(11) DEFAULT NULL,
  `sh` int(11) DEFAULT NULL,
  `sf` int(11) DEFAULT NULL,
  `ci` int(11) DEFAULT NULL,
  `gdp` int(11) DEFAULT NULL,
  `g` int(11) DEFAULT NULL,
  `gs` int(11) DEFAULT NULL,
  `ebh` int(11) DEFAULT NULL,
  `pitches_seen` int(11) DEFAULT NULL,
  `avg` double DEFAULT NULL,
  `obp` double DEFAULT NULL,
  `slg` double DEFAULT NULL,
  `rc` double DEFAULT NULL,
  `rc27` double DEFAULT NULL,
  `iso` double DEFAULT NULL,
  `woba` double DEFAULT NULL,
  `ops` double DEFAULT NULL,
  `sbp` double DEFAULT NULL,
  `ws` double DEFAULT NULL,
  `kp` double DEFAULT NULL,
  `bbp` double DEFAULT NULL,
  `wpa` double DEFAULT NULL,
  `babip` double DEFAULT NULL,
  PRIMARY KEY (`lhbs_id`),
  INDEX `lhbs_ix1` (`year`),
  INDEX `lhbs_ix2` (`league_id`),
  INDEX `lhbs_ix3` (`split_id`)    
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

 

Create  league_history_fielding_stats table

Similar to the above, but for fielding stats at league level.  I haven’t yet dug into what fielding stats I will be calculating, so I don’t have all that much to say about this table.

CREATE TABLE `league_history_fielding_stats` (
  `lhfs_id` int(11) NOT NULL AUTO_INCREMENT,    
  `year` smallint(6) NOT NULL,
  `team_id` int(11) NOT NULL,
  `league_id` int(11) NOT NULL,
  `sub_league_id` int(11) DEFAULT NULL,
  `level_id` smallint(6) DEFAULT NULL,
  `split_id` smallint(6) NOT NULL,
  `position` smallint(6) DEFAULT NULL,
  `g` int(11) DEFAULT NULL,
  `gs` int(11) DEFAULT NULL,
  `tc` int(11) DEFAULT NULL,
  `a` int(11) DEFAULT NULL,
  `po` int(11) DEFAULT NULL,
  `e` int(11) DEFAULT NULL,
  `dp` int(11) DEFAULT NULL,
  `tp` int(11) DEFAULT NULL,
  `pb` int(11) DEFAULT NULL,
  `sba` int(11) DEFAULT NULL,
  `rto` int(11) DEFAULT NULL,
  `er` int(11) DEFAULT NULL,
  `ip` int(11) DEFAULT NULL,
  `ipf` int(11) DEFAULT NULL,
  `pct` double DEFAULT NULL,
  `range` double DEFAULT NULL,
  `rtop` double DEFAULT NULL,
  `cera` double DEFAULT NULL,
  `zr` double DEFAULT NULL,
  `plays` int(11) DEFAULT NULL,
  `plays_base` int(11) DEFAULT NULL,
  `roe` int(11) DEFAULT NULL,
  `eff` int(11) DEFAULT NULL,
  PRIMARY KEY (`lhfs_id`),
  INDEX `lhfs_ix1` (`year`),
  INDEX `lhfs_ix2` (`league_id`),
  INDEX `lhfs_ix3` (`split_id`) 
    ) ENGINE=InnoDB DEFAULT CHARSET=latin1;

 

Create league_history_pitching_stats table

This table may come in handy at some point – I haven’t done a lot of work on pitching stats yet.  But it does NOT come in handy for establishing a run environment for a league.  Why?  Because you can’t determine the number of outs!  It’s strange – as the player table has an outs column.  This table has an IP column, which would be OK if it weren’t an integer.  IP is usually tracked with a baseball-specific decimal number (x.y) where x is the number of complete innings pitched and y (0, 1, 2)  represent the outs of incomplete innings.  What are you supposed to do with only the integer?  Yes, well, apparently the column ‘ipf’  stands for Innings Pitched Fraction.  So, Outs = (3*IP) + IPF.  Anyway, I am sure I will find plenty of uses for this table in the future:

CREATE TABLE `league_history_pitching_stats` (
  `lhps_id` int(11)  NOT NULL AUTO_INCREMENT,    
  `year` smallint(6) NOT NULL,
  `team_id` int(11) NOT NULL,
  `game_id` int(11) DEFAULT NULL,
  `league_id` int(11) NOT NULL,
  `level_id` smallint(6) DEFAULT NULL,
  `split_id` smallint(6) DEFAULT NULL,
  `ab` int(11) DEFAULT NULL,
  `ip` int(11) DEFAULT NULL,
  `bf` int(11) DEFAULT NULL,
  `tb` int(11) DEFAULT NULL,
  `ha` int(11) DEFAULT NULL,
  `k` int(11) DEFAULT NULL,
  `rs` int(11) DEFAULT NULL,
  `bb` int(11) DEFAULT NULL,
  `r` int(11) DEFAULT NULL,
  `er` int(11) DEFAULT NULL,
  `gb` int(11) DEFAULT NULL,
  `fb` int(11) DEFAULT NULL,
  `pi` int(11) DEFAULT NULL,
  `ipf` int(11) DEFAULT NULL,
  `g` int(11) DEFAULT NULL,
  `gs` int(11) DEFAULT NULL,
  `w` int(11) DEFAULT NULL,
  `l` int(11) DEFAULT NULL,
  `s` int(11) DEFAULT NULL,
  `sa` int(11) DEFAULT NULL,
  `da` int(11) DEFAULT NULL,
  `sh` int(11) DEFAULT NULL,
  `sf` int(11) DEFAULT NULL,
  `ta` int(11) DEFAULT NULL,
  `hra` int(11) DEFAULT NULL,
  `bk` int(11) DEFAULT NULL,
  `ci` int(11) DEFAULT NULL,
  `iw` int(11) DEFAULT NULL,
  `wp` int(11) DEFAULT NULL,
  `hp` int(11) DEFAULT NULL,
  `gf` int(11) DEFAULT NULL,
  `dp` int(11) DEFAULT NULL,
  `qs` int(11) DEFAULT NULL,
  `svo` int(11) DEFAULT NULL,
  `bs` int(11) DEFAULT NULL,
  `ra` int(11) DEFAULT NULL,
  `ir` int(11) DEFAULT NULL,
  `irs` int(11) DEFAULT NULL,
  `cg` int(11) DEFAULT NULL,
  `sho` int(11) DEFAULT NULL,
  `sb` int(11) DEFAULT NULL,
  `cs` int(11) DEFAULT NULL,
  `hld` int(11) DEFAULT NULL,
  `r9` double DEFAULT NULL,
  `avg` double DEFAULT NULL,
  `obp` double DEFAULT NULL,
  `slg` double DEFAULT NULL,
  `ops` double DEFAULT NULL,
  `h9` double DEFAULT NULL,
  `k9` double DEFAULT NULL,
  `hr9` double DEFAULT NULL,
  `bb9` double DEFAULT NULL,
  `cgp` double DEFAULT NULL,
  `fip` double DEFAULT NULL,
  `qsp` double DEFAULT NULL,
  `winp` double DEFAULT NULL,
  `rsg` double DEFAULT NULL,
  `svp` double DEFAULT NULL,
  `bsvp` double DEFAULT NULL,
  `irsp` double DEFAULT NULL,
  `gfp` double DEFAULT NULL,
  `era` double DEFAULT NULL,
  `pig` double DEFAULT NULL,
  `ws` double DEFAULT NULL,
  `whip` double DEFAULT NULL,
  `gbfbp` double DEFAULT NULL,
  `kbb` double DEFAULT NULL,
  `babip` double DEFAULT NULL,
  `wpa` double DEFAULT NULL,
  PRIMARY KEY (`lhps_id`),
  INDEX `lhps_ix1` (`year`),
  INDEX `lhps_ix2` (`league_id`),
  INDEX `lhps_ix3` (`split_id`)     
  ) ENGINE=InnoDB DEFAULT CHARSET=latin1;