It’s that time, once again, to try to deal with park adjusted stats. Again, and against counsel, I will be pulling the park factors from the teams table rather than doing the calculations myself. I got within spitting distance of a good result set for wRC+, so I am hoping for similar with these park-adjusted pitching stats.
First up is ERA-. ERA- takes a pitcher’s ERA and puts it in the context of his league and his home park. This makes it possible to compare players across eras and leagues, essentially normalizing the data. 100 is league average. Every point below 100 is 1 percent better than average.
The formula is pretty straight-forward:
ERA Minus = 100*((ERA + (ERA – ERA*(PF/100)) )/ AL or NL ERA)
A few things have to happen in order to run this calc. First, we’ll need sub-league ERA’s. As mentioned in the first FIP post, we sort of do but really don’t have this on the league_history_table. Better to roll our own from players_career_pitching_stats table. We’ll do this in the same manner that we did it for batting- joining to the team relations table to get subleague.
Here’s how:
DROP TABLE IF EXISTS sub_league_history_pitching;
CREATE TABLE IF NOT EXISTS sub_league_history_pitching AS
SELECT
year
, league_id
, sub_league_id
, round((totER/totIP)*9,2) AS slgERA
FROM (
SELECT p.year
, p.league_id
, t.sub_league_id
, ((sum(ip)*3)+sum(ipf))/3 AS totIP
, sum(er) AS totER
FROM CalcPitching AS p INNER JOIN team_relations AS t ON p.team_id=t.team_id
GROUP BY year, league_id, sub_league_id
) AS x ;
Before we move on to the park factor, we have to make sure that we can associate a player’s team with his sub-league. As usual, I’m sure that there’s a more elegant way to go about this than where I landed. The problem I needed to solve was that sub-leagues do not have unique identifiers; they are uniquely identified only as composites of league_id and sub_league_id. So, it’s not enough to refer to a sub-league as sub-league-1. There are as many sub-league-1’s as there are leagues. To make matters more complicated, the teams table does not carry a sub-league field. That’s why we had to refer to the team_relations table. Unfortunately, the team_relations table is the only table that contains all three necessary data points to pin down a team/sub-league relationship. When I tried to let the database do the thinking for me by joining to it, it wasn’t consistently choosing the correct sub-league for each team.
I decided to add sub-league as a field to the already-crowded CalcPitching table. It worked in testing, correctly pulling the right slgERA for each league-sub_league-year. Like I said, I bet there’s a way to do this only with joins, but I wasn’t able to figure it out. I am going to go back to the CalcBatting table and do the same thing. Here’s the code for the new joins:
INNER JOIN team_relations AS r ON i.team_id=r.team_id AND i.league_id=r.league_id
INNER JOIN sub_league_history_pitching AS slg ON i.year=slg.year AND i.league_id=slg.league_id AND r.sub_league_id=slg.sub_league_id
The next thing is to return the park factor for each pitcher-stint-year. We’ll do this by joining to the teams table, then to the parks table:
INNER JOIN teams AS t ON i.team_id=t.team_id
INNER JOIN parks AS p ON t.park_id=p.park_id
With all that done, we’ve got to go back and define ERA as a variable so that we can reference it here without elaborating it. Then, the formula is simple. OOTP doesn’t track this stat either, so it’s hard to say with any certainty how well this works or how badly I’m getting bad results from using hard-coded park factors. I did a quick sniff test, looking at ranges of ERA’s in my league and sniffing the ERA- stats for each. It looks OK, I guess?
OOTP uses ERA+ instead, which seems to be more or less the same stat scaled up from 100 rather than down. I will tackle that one next.
Here’s the full script for CalcPitching so far:
DROP TABLE IF EXISTS CalcPitching;
CREATE TABLE IF NOT EXISTS CalcPitching AS
SELECT
i.player_id
, i.year
, i.stint
, i.team_id
, i.league_id
, r.sub_league_id
, split_id
, i.ip
, i.ab
, i.tb
, i.ha
, i.k
, i.bf
, i.rs
, i.bb
, i.r
, i.er
, i.gb
, i.fb
, i.pi
, i.ipf
, i.g
, i.gs
, i.w
, i.l
, i.s
, i.sa
, i.da
, i.sh
, i.sf
, i.ta
, i.hra
, i.bk
, i.ci
, i.iw
, i.wp
, i.hp
, i.gf
, i.dp
, i.qs
, i.svo
, i.bs
, i.ra
, i.cg
, i.sho
, i.sb
, i.cs
, i.hld
, i.ir
, i.irs
, i.wpa
, i.li
, i.outs
, i.war
, @InnPitch := ((3*ip)+ipf)/3 AS InnPitch
, round((9*i.k)/@InnPitch,1) AS 'k9'
, round((9*i.bb)/@InnPitch,1) AS 'bb9'
, round((9*i.hra)/@InnPitch,1) AS 'HR9'
, round((i.bb+i.ha)/@InnPitch,2) AS WHIP
, round(i.k/i.bb,2) AS 'K/BB'
, i.gb/i.fb AS 'gb/fb'
, round((i.ha-i.hra)/(i.ab-i.k-i.hra-i.sh+i.sf),3) AS BABIP
, @ERA := round((i.er/@InnPitch)*9,2) AS ERA
, round(((13*i.hra)+(3*(i.bb+i.hp))-(2*i.k))/@InnPitch+f.FIPConstant,2) AS FIP
, round(((13*(i.fb*f.hr_fb_pct))+(3*(i.bb+i.hp))-(2*i.k))/@InnPitch+f.FIPConstant,2) AS xFIP
, round(100*((@ERA + (@ERA - @ERA*(p.avg)))/slg.slgERA),0) AS ERAminus
FROM players_career_pitching_stats AS i
INNER JOIN team_relations AS r ON i.team_id=r.team_id AND i.league_id=r.league_id
INNER JOIN FIPConstant AS f ON i.year=f.year AND i.league_id=f.league_id
INNER JOIN sub_league_history_pitching AS slg ON i.year=slg.year AND i.league_id=slg.league_id AND r.sub_league_id=slg.sub_league_id
INNER JOIN teams AS t ON i.team_id=t.team_id
INNER JOIN parks AS p ON t.park_id=p.park_id
WHERE i.split_id=1 AND i.league_id<>0;