Pitching Stats 5: ERA-

It’s that time, once again, to try to deal with park adjusted stats.  Again, and against counsel, I will be pulling the park factors from the teams table rather than doing the calculations myself.  I got within spitting distance of a good result set for wRC+, so I am hoping for similar with these park-adjusted pitching stats.

First up is ERA-.  ERA- takes a pitcher’s ERA and puts it in the context of his league and his home park.  This makes it possible to compare players across eras and leagues, essentially normalizing the data.  100 is league average.  Every point below 100 is 1 percent better than average.

The formula is pretty straight-forward:
ERA Minus = 100*((ERA + (ERA – ERA*(PF/100)) )/ AL or NL ERA)

A few things have to happen in order to run this calc.  First, we’ll need sub-league ERA’s.  As mentioned in the first FIP post, we sort of do but really don’t have this on the league_history_table.  Better to roll our own from players_career_pitching_stats table.  We’ll do this in the same manner that we did it for batting- joining to the team relations table to get subleague.

Here’s how:

DROP TABLE IF EXISTS sub_league_history_pitching;
CREATE TABLE IF NOT EXISTS sub_league_history_pitching AS

SELECT
       year
     , league_id
     , sub_league_id
     , round((totER/totIP)*9,2) AS slgERA 
FROM  (        
     SELECT p.year
          , p.league_id
          , t.sub_league_id
          , ((sum(ip)*3)+sum(ipf))/3 AS totIP
          , sum(er) AS totER
     FROM CalcPitching AS p INNER JOIN team_relations AS t ON p.team_id=t.team_id
     GROUP BY year, league_id, sub_league_id
      ) AS x ;

Before we move on to the park factor, we have to make sure that we can associate a player’s team with his sub-league.  As usual, I’m sure that there’s a more elegant way to go about this than where I landed.  The problem I needed to solve was that sub-leagues do not have unique identifiers; they are uniquely identified only as composites of league_id and sub_league_id.  So, it’s not enough to refer to a sub-league as sub-league-1.  There are as many sub-league-1’s as there are leagues.  To make matters more complicated, the teams table does not carry a sub-league field.  That’s why we had to refer to the team_relations table.  Unfortunately, the team_relations table is the only table that contains all three necessary data points to pin down a team/sub-league relationship.  When I tried to let the database do the thinking for me by joining to it, it wasn’t consistently choosing the correct sub-league for each team.

I decided to add sub-league as a field to the already-crowded CalcPitching table.  It worked in testing, correctly pulling the right slgERA for each league-sub_league-year.  Like I said, I bet there’s a way to do this only with joins, but I wasn’t able to figure it out.  I am going to go back to the CalcBatting table and do the same thing.  Here’s the code for the new joins:

INNER JOIN team_relations AS r ON i.team_id=r.team_id AND i.league_id=r.league_id
INNER JOIN sub_league_history_pitching AS slg ON i.year=slg.year AND i.league_id=slg.league_id AND r.sub_league_id=slg.sub_league_id

The next thing is to return the park factor for each pitcher-stint-year.  We’ll do this by joining to the teams table, then to the parks table:

INNER JOIN teams AS t ON i.team_id=t.team_id
INNER JOIN parks AS p ON t.park_id=p.park_id

With all that done, we’ve got to go back and define ERA as a variable so that we can reference it here without elaborating it.  Then, the formula is simple.  OOTP doesn’t track this stat either, so it’s hard to say with any certainty how well this works or how badly I’m getting bad results from using hard-coded park factors.  I did a quick sniff test, looking at ranges of ERA’s in my league and sniffing the ERA- stats for each.  It looks OK, I guess?

OOTP uses ERA+ instead, which seems to be more or less the same stat scaled up from 100 rather than down.  I will tackle that one next.

Here’s the full script for CalcPitching so far:

DROP TABLE IF EXISTS CalcPitching;
CREATE TABLE IF NOT EXISTS CalcPitching AS

SELECT
    i.player_id
    , i.year
    , i.stint
    , i.team_id
    , i.league_id
    , r.sub_league_id
    , split_id
    , i.ip
    , i.ab
    , i.tb
    , i.ha
    , i.k
    , i.bf
    , i.rs
    , i.bb
    , i.r
    , i.er
    , i.gb
    , i.fb
    , i.pi
    , i.ipf
    , i.g
    , i.gs
    , i.w
    , i.l
    , i.s
    , i.sa
    , i.da
    , i.sh
    , i.sf
    , i.ta
    , i.hra
    , i.bk
    , i.ci
    , i.iw
    , i.wp
    , i.hp
    , i.gf
    , i.dp
    , i.qs
    , i.svo
    , i.bs
    , i.ra
    , i.cg
    , i.sho
    , i.sb
    , i.cs
    , i.hld
    , i.ir
    , i.irs
    , i.wpa
    , i.li
    , i.outs
    , i.war
    , @InnPitch := ((3*ip)+ipf)/3 AS InnPitch
    , round((9*i.k)/@InnPitch,1) AS 'k9'
    , round((9*i.bb)/@InnPitch,1) AS 'bb9'
    , round((9*i.hra)/@InnPitch,1) AS 'HR9'
    , round((i.bb+i.ha)/@InnPitch,2) AS WHIP
    , round(i.k/i.bb,2) AS 'K/BB'
    , i.gb/i.fb AS 'gb/fb'
    , round((i.ha-i.hra)/(i.ab-i.k-i.hra-i.sh+i.sf),3) AS BABIP
    , @ERA := round((i.er/@InnPitch)*9,2) AS ERA
    , round(((13*i.hra)+(3*(i.bb+i.hp))-(2*i.k))/@InnPitch+f.FIPConstant,2) AS FIP 
    , round(((13*(i.fb*f.hr_fb_pct))+(3*(i.bb+i.hp))-(2*i.k))/@InnPitch+f.FIPConstant,2) AS xFIP
    , round(100*((@ERA + (@ERA - @ERA*(p.avg)))/slg.slgERA),0) AS ERAminus
      
FROM players_career_pitching_stats AS i
    INNER JOIN team_relations AS r ON i.team_id=r.team_id AND i.league_id=r.league_id
    INNER JOIN FIPConstant AS f ON i.year=f.year AND i.league_id=f.league_id
    INNER JOIN sub_league_history_pitching AS slg ON i.year=slg.year AND i.league_id=slg.league_id AND r.sub_league_id=slg.sub_league_id
    INNER JOIN teams AS t ON i.team_id=t.team_id
    INNER JOIN parks AS p ON t.park_id=p.park_id
WHERE i.split_id=1 AND i.league_id<>0;

 

Pitching Stats 3: FIP – The Conclusion

I redid the FIPConstant table to pull summed data from the players_career_pitching_stats table.  That table now looks like this:

DROP TABLE IF EXISTS FIPConstant;
CREATE TABLE IF NOT EXISTS FIPConstant AS

SELECT
      year
    , league_id
    , hra_totals
    , bb_totals
    , hp_totals
    , k_totals
    , er_totals
    , ip_totals
    , ipf_totals
    , @HRAdj := 13*hra_totals AS Adjusted_HR
    , @BBAdj := 3*bb_totals AS Adjusted_BB
    , @HPAdj := 3*hp_totals AS Adjusted_HP
    , @KAdj  := 2*k_totals AS Adjusted_K
    , @InnPitch := ((ip_totals*3)+ipf_totals)/3 AS InnPitch
    , @lgERA := round((er_totals/@InnPitch)*9,2) AS lgERA
    , round(@lgERA - ((@HRAdj+@BBAdj+@HPAdj-@KAdj)/@InnPitch),2) AS FIPConstant
FROM (
         SELECT year
                , league_id
                , sum(hra) as hra_totals
                , sum(bb) as bb_totals
                , sum(hp) as hp_totals
                , sum(k) as k_totals
                , sum(er) as er_totals
                , sum(ip) as ip_totals
                , sum(ipf) as ipf_totals
          FROM players_career_pitching_stats
          GROUP BY year, league_id
      ) AS x;

And how did it work?  Better.

9 within 0.05; 26 within 0.11.  I’m still curious as to why I’m not matching up even better.  I still have a lingering suspicion that HBP is behind this, but I am going to let it lie for now unless it comes back to bite me on other calculations.

Our CalcPitching table to this point:

DROP TABLE IF EXISTS CalcPitching;
CREATE TABLE IF NOT EXISTS CalcPitching AS

SELECT
    i.player_id
    , i.year
    , i.stint
    , i.team_id
    , i.league_id
    , split_id
    , i.ip
    , i.ab
    , i.tb
    , i.ha
    , i.k
    , i.bf
    , i.rs
    , i.bb
    , i.r
    , i.er
    , i.gb
    , i.fb
    , i.pi
    , i.ipf
    , i.g
    , i.gs
    , i.w
    , i.l
    , i.s
    , i.sa
    , i.da
    , i.sh
    , i.sf
    , i.ta
    , i.hra
    , i.bk
    , i.ci
    , i.iw
    , i.wp
    , i.hp
    , i.gf
    , i.dp
    , i.qs
    , i.svo
    , i.bs
    , i.ra
    , i.cg
    , i.sho
    , i.sb
    , i.cs
    , i.hld
    , i.ir
    , i.irs
    , i.wpa
    , i.li
    , i.outs
    , i.war
    , @InnPitch := ((3*ip)+ipf)/3 AS InnPitch
    , round((9*i.k)/@InnPitch,1) AS 'k9'
    , round((9*i.bb)/@InnPitch,1) AS 'bb9'
    , round((9*i.hra)/@InnPitch,1) AS 'HR9'
    , round((i.bb+i.ha)/@InnPitch,2) AS WHIP
    , round(i.k/i.bb,2) AS 'K/BB'
    , i.gb/i.fb AS 'gb/fb'
    , round((i.ha-i.hra)/(i.ab-i.k-i.hra-i.sh+i.sf),3) AS BABIP
    , round((i.er/@InnPitch)*9,2) AS ERA
    , round(((13*i.hra)+(3*(i.bb+i.hp))-(2*i.k))/@InnPitch+f.FIPConstant,2) AS fip
    
    
FROM players_career_pitching_stats AS i
    INNER JOIN FIPConstant AS f ON i.year=f.year AND i.league_id=f.league_id
WHERE i.split_id=1;

Pitching Stats 2: FIP – The False Start

FIP, or Fielding Independent Pitching, is based on the idea that pitchers are only in control of the “3 true outcomes” of a plate appearance: Strikeouts, Home Runs, and Free Passes (HBP and BB’s).  Everything else relies on defense which is largely beyond the pitcher’s control.  FIP is scaled, through the use of a constant, to a league’s ERA.

The formula to derive FIP is:

FIP = ((13*HR)+(3*(BB+HBP))-(2*K))/IP + constant

and the formula for the deriving the constant is similar:

FIP Constant = lgERA – (((13*lgHR)+(3*(lgBB+lgHBP))-(2*lgK))/lgIP)

We’re going to make a quick table to calculate the FIPConstant for each league year that we’ll reference when calculating FIP for each player stint.  Happily, the game gives us league ERA in the league_history_pitching_stats table, so we’ve been spared a step.  Because I am, apparently, not very good with the order of operations and parentheses, I have spent the last hour pulling my hair out trying to get a FIP Constant that looks reasonable.  In an attempt to save some of my last remaining hairs, I made a very inelegant table.  Behold my genius:

DROP TABLE IF EXISTS FIPConstant;
CREATE TABLE IF NOT EXISTS FIPConstant AS

SELECT
    lhps_id
    , year
    , league_id
    , level_id
    , hra
    , bb
    , hp
    , k
    , @HRAdj := 13*hra AS Adjusted_HR
    , @BBAdj := 3*bb AS Adjusted_BB
    , @HPAdj := 3*hp AS Adjusted_HP
    , @KAdj  := 2*k AS Adjusted_K
    , @InnPitch := ((ip*3)+ipf)/3 AS InnPitch
    , era
    , era - ((@HRAdj+@BBAdj+@HPAdj-@KAdj)/@InnPitch) AS FIPConstant
FROM league_history_pitching_stats;

On the CalcPitching table, we’re adding FIP and disregarding left/right splits for the moment.  Our table script now looks like this:

DROP TABLE IF EXISTS CalcPitching;
CREATE TABLE IF NOT EXISTS CalcPitching AS

SELECT
    i.player_id
    , i.year
    , i.stint
    , i.team_id
    , i.league_id
    , split_id
    , i.ip
    , i.ab
    , i.tb
    , i.ha
    , i.k
    , i.bf
    , i.rs
    , i.bb
    , i.r
    , i.er
    , i.gb
    , i.fb
    , i.pi
    , i.ipf
    , i.g
    , i.gs
    , i.w
    , i.l
    , i.s
    , i.sa
    , i.da
    , i.sh
    , i.sf
    , i.ta
    , i.hra
    , i.bk
    , i.ci
    , i.iw
    , i.wp
    , i.hp
    , i.gf
    , i.dp
    , i.qs
    , i.svo
    , i.bs
    , i.ra
    , i.cg
    , i.sho
    , i.sb
    , i.cs
    , i.hld
    , i.ir
    , i.irs
    , i.wpa
    , i.li
    , i.outs
    , i.war
    , @InnPitch := ((3*ip)+ipf)/3 AS InnPitch
    , round((9*i.k)/@InnPitch,1) AS 'k9'
    , round((9*i.bb)/@InnPitch,1) AS 'bb9'
    , round((9*i.hra)/@InnPitch,1) AS 'HR9'
    , round((i.bb+i.ha)/@InnPitch,2) AS WHIP
    , round(i.k/i.bb,2) AS 'K/BB'
    , i.gb/i.fb AS 'gb/fb'
    , round((i.ha-i.hra)/(i.ab-i.k-i.hra-i.sh+i.sf),3) AS BABIP
    , round(i.er/@InnPitch,2) AS ERA
    , round(((13*i.hra)+(3*(i.bb+i.hp))-(2*i.k))/@InnPitch+f.FIPConstant,2) AS fip
    
    
FROM players_career_pitching_stats AS i
    INNER JOIN FIPConstant AS f ON i.year=f.year AND i.league_id=f.league_id
WHERE i.split_id=1;

So, how did it go?  Not great.  I took a random sample from my database and compared it to the game’s generated stats.  I wanted my FIP calculations to be within .05 of the game’s.

While most were in the medium range, it seems that there’s something different in the way the game calculates FIP.  Our numbers are close enough that it can’t be a major difference.  I’m going to follow a hunch and guess that it’s Hit By Pitch.  I will remove HBP as a factor in both the FIPConstant and FIP calculations and see what that does to our results.

I got about a third of the way through the revised calcs when I noticed a problem with the FIPConstant table.  This table pulls data from the league_history_pitching_stats table.  The problem is there.  You see, as I mentioned in the table setup posts and then promptly forgot about, there are a couple of columns in the league_history tables that attempt to distinguish between subleagues but do not give any indication of which is which. (They are the team_id and game_id columns.)  What this does create two records for each league (one for each subleague) with different totals but no way to identify the subleague being referenced.  This is no good.

My new hunch is that HBP is not the issue.  The formula is probably fine, I will just have to change the FIPConstant table to sum data from players_career_pitching_stats.  I’m going to publish this post as a testament to my naiveté and get to work on the revised table.

Pitching Stats 1: The Easy Stuff

Here’s the same explanation of how the stats tables are organized as we used in the first Batting Stats post:

Stats are collected for each player who accumulates them.  Each player gets his own row.  For each year that a player accumulates stats, a new row of data is created for that player.  For each team that a player plays in a given year (stint), a new row of data is created for that player.  Stats are accumulated and placed into three splits for each player-year-stint: Overall, vs. Left, and vs. Right.

As we did for the batting stats, we’ll be creating a new table for all of the pitching stats together in one place; counting stats provided by the game and calculated stats that we’ll derive here.

We’re carrying over all of the counting stats, plus WPA and WAR.  The calculated stats we’re adding in this post fall in the category of Easy Stuff:

  • InnPitch – I set this as a variable to avoid having to elaborate every time. This is the IP integer plus the IPF (innings pitched fraction) x 0.33
    round(IP + (IPF * .33),1).
  • All of the “x9” stats: K/9, BB/9 etc.
  • WHIP
  • GB/FB – Ground Ball/Fly Ball outs
  • BABIP (see the batting post for more on this)
  • ERA

Here’s the code:

DROP TABLE IF EXISTS CalcPitching;
CREATE TABLE IF NOT EXISTS CalcPitching AS

SELECT
    i.player_id
    , i.year
    , i.stint
    , i.team_id
    , i.league_id
    , split_id
    , i.ip
    , i.ab
    , i.tb
    , i.ha
    , i.k
    , i.bf
    , i.rs
    , i.bb
    , i.r
    , i.er
    , i.gb
    , i.fb
    , i.pi
    , i.ipf
    , i.g
    , i.gs
    , i.w
    , i.l
    , i.s
    , i.sa
    , i.da
    , i.sh
    , i.sf
    , i.ta
    , i.hra
    , i.bk
    , i.ci
    , i.iw
    , i.wp
    , i.hp
    , i.gf
    , i.dp
    , i.qs
    , i.svo
    , i.bs
    , i.ra
    , i.cg
    , i.sho
    , i.sb
    , i.cs
    , i.hld
    , i.ir
    , i.irs
    , i.wpa
    , i.li
    , i.outs
    , i.war
    , @InnPitch := round(i.ip + (i.ipf*.33),1) AS InnPitch
    , round((9*i.k)/@InnPitch,1) AS 'k9'
    , round((9*i.bb)/@InnPitch,1) AS 'bb9'
    , round((9*i.hra)/@InnPitch,1) AS 'HR9'
    , round((i.bb+i.ha)/@InnPitch,2) AS WHIP
    , round(i.k/i.bb,2) AS 'K/BB'
    , i.gb/i.fb AS 'gb/fb'
    , round((i.ha-i.hra)/(i.ab-i.k-i.hra-i.sh+i.sf),3) AS BABIP
    , round((i.er/@InnPitch)*9,2) AS ERA
    
    
FROM players_career_pitching_stats AS i;

Run Environment 6: Run Values 2

The last table for our Run Environment set is almost exactly the same as the one I presented before with two differences: we’re returning the runMinus and RunPlus from the intermediate table, and we’re using a variable to define wOBAscale- making the whole thing much more elegant and easy to understand.

# The last Run Values table, returning fields from RunValues1A, defining wOBAscale
# as a variable, and giving us the weighted factors by league year for batting events
DROP TABLE IF EXISTS tblRunValues2;
CREATE TABLE tblRunValues2 AS
SELECT year
, league_id
, RperOut 
, runBB 
, runHB 
, run1B 
, run2B 
, run3B 
, runHR 
, runSB 
, runCS 
, runMinus 
, runPlus 
, wOBA 
, @ws := 1/(runPlus+runMinus) AS wOBAscale 
, (runBB+runMinus)*@ws AS wobaBB 
, (runHB+runMinus)*@ws AS wobaHB 
, (run1B+runMinus)*@ws AS woba1B 
, (run2B+runMinus)*@ws AS woba2B 
, (run3B+runMinus)*@ws AS woba3B 
, (runHR+runMinus)*@ws AS wobaHR 
, runSB*@ws AS wobaSB 
, runCS*@ws AS wobaCS 
FROM tblRunValues1A;

This is the table that we’ll refer to when calculating wOBA for individual players, factoring wobaXX against their counting stats XX.  It’ll make more sense when we actually do it.

This is almost all of the data we need to complete our batting stats.  What remains is park factor.  I had been hoping that we could use the factors listed on the parks table.  Unfortunately, OOTP’s resident MySQL Whisperer says it’s not true. I’m going to set that aside for today- while I figure out how to calculate park factors – and use an incremental approach to developing the batting stats view.  I figure that will give me one, maybe two, posts to get a good method for that.

Run Environment 5: An Intermediate Table

In addition to using variables, Wyers also does the other thing I had thought about using: an intermediate table to store the runMinus, runPlus, and wOBAScale values.  On the one hand, this makes sense as we will only have to write out the calculations once and then refer to them in this table when we need them.  On the other hand, if we managed to use variables for defining the run values in the previous table, we should be able to do so with these values as well.

Also, in this table, we define league wOBA.  It makes sense that the weighted On Base Average for the entire league is just the entire league’s On Base Percentage.  If we’re looking at an average for the entire league, there is nothing to weigh it against.  We’ll see this idea again when we look at wRC+.

A few minor adjustments to the code to allow for OOTP’s naming conventions and one final difference:  I’m taking the batting stats from the league_history_batting table rather than the players_ file.  It’s already there and there are far fewer rows to sort through.  And given that we’re not filtering out any stats, it’s silly not to use it.  And, because we’re using it, we can use the column for singles (s) rather than deriving singles from H-2b-3b-HR.

#Creating an intermediate table so as not to have to write out formulae for rumMinus etc.
DROP TABLE IF EXISTS tblRunValues1A;
CREATE TABLE IF NOT EXISTS tblRunValues1A AS
SELECT r.year
, r.league_id
, r.RperOut 
, r.runBB 
, r.runHB 
, r.run1B 
, r.run2B 
, r.run3B 
, r.runHR 
, r.runSB 
, r.runCS 
, SUM(runBB*(BB-IBB) + runHB * HP + run1B * s + run2B * d 
   + run3B * t + 1.4 * HR + runSB * SB - runCS * CS) 
   / SUM(AB - H + SF) AS runMinus 

, SUM(runBB * (BB-IBB) + runHB * HP + run1B * s + run2B * d 
   + run3B * t + 1.4 * HR + runSB * SB - runCS * CS) 
   / SUM(BB-IBB + HP + H) AS runPlus

, SUM(H+BB-IBB+HP) / SUM(AB+BB-IBB+HP+SF) AS wOBA 
 
FROM tblRunValues r
INNER JOIN league_history_batting_stats 
   ON r.year=league_history_batting_stats.year 
   AND r.league_id=league_history_batting_stats.league_id

 
GROUP BY 
r.year
, r.league_id
, r.RperOut 
, r.runBB 
, r.runHB 
, r.run1B 
, r.run2B 
, r.run3B 
, r.runHR 
, r.runSB 
, r.runCS 
 
ORDER BY 
r.year DESC;

Tables 10: Team History Financials and Team Relations

Creating team_history_financials table

This table tracks a whole bunch of team financial data per year.  Useful in doing team by team financial comparisons as well as deriving fun statistics such as $/win or looking at correlation between fan interest/loyalty and wins. Of note, the inclusion of league, sub_league, and division columns is a little confusing and seems to be asking for trouble.  Those relationships are established on the ‘teams’ table and seem superfluous here.

CREATE TABLE `team_history_financials` (
  `team_id` int(11) NOT NULL,
  `year` smallint(6) NOT NULL,
  `league_id` int(11) DEFAULT NULL,
  `sub_league_id` int(11) DEFAULT NULL,
  `division_id` int(11) DEFAULT NULL,
  `gate_revenue` int(11) DEFAULT NULL,
  `season_ticket_revenue` int(11) DEFAULT NULL,
  `media_revenue` int(11) DEFAULT NULL,
  `merchandising_revenue` int(11) DEFAULT NULL,
  `other_revenue` int(11) DEFAULT NULL,
  `revenue_sharing` int(11) DEFAULT NULL,
  `playoff_revenue` int(11) DEFAULT NULL,
  `cash` int(11) DEFAULT NULL,
  `cash_owner` int(11) DEFAULT NULL,
  `cash_trades` int(11) DEFAULT NULL,
  `previous_balance` int(11) DEFAULT NULL,
  `player_expenses` int(11) DEFAULT NULL,
  `staff_expenses` int(11) DEFAULT NULL,
  `stadium_expenses` int(11) DEFAULT NULL,
  `season_tickets` int(11) DEFAULT NULL,
  `attendance` int(11) DEFAULT NULL,
  `fan_interest` smallint(6) DEFAULT NULL,
  `fan_loyalty` smallint(6) DEFAULT NULL,
  `fan_modifier` smallint(6) DEFAULT NULL,
  `ticket_price` double DEFAULT NULL,
  `local_media_contract` int(11) DEFAULT NULL,
  `local_media_contract_expires` int(11) DEFAULT NULL,
  `national_media_contract` int(11) DEFAULT NULL,
  `national_media_contract_expires` int(11) DEFAULT NULL,
  `development_budget` int(11) DEFAULT NULL,
  `draft_budget` int(11) DEFAULT NULL,
  `draft_expenses` int(11) DEFAULT NULL,
  `budget` int(11) DEFAULT NULL,
  `market` smallint(6) DEFAULT NULL,
  `owner_expectation` smallint(6) DEFAULT NULL,
  `scouting_budget` int(11) DEFAULT NULL,
  `scouting_amateur` int(11) DEFAULT NULL,
  `scouting_major` int(11) DEFAULT NULL,
  `scouting_minor` int(11) DEFAULT NULL,
  `scouting_international` int(11) DEFAULT NULL,
  PRIMARY KEY (`team_id`,`year`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

 

Creating team_relations table

I think I included this table in the dump because I thought it did what team_affiliations does.  It does not.  I think this table is only useful for showing current standings without having to do complicated joins on the teams, leagues, sub_leagues, and divisions tables.  Well, not complicated exactly, but cumbersome.

CREATE TABLE `team_relations` (
  `league_id` int(11) NOT NULL,
  `sub_league_id` int(11) NOT NULL,
  `division_id` int(11) NOT NULL,
  `team_id` int(11) NOT NULL,
  PRIMARY KEY (`league_id`,`sub_league_id`,`division_id`,`team_id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

Tables 8: Player Career Pitching Stats

Same breakdown here as the previous career stats tables, with indexes on player_id, year, split_id, stint, and team_id.  Similar to the batting stats table, we have a couple calculated statistics here: WPA (though I am not sure how this works for pitchers…is it per AB?), and WAR.  We also have a column that I haven’t been able to identify: ‘li’.

I will come back to this in another section, but this is the table we’ll reference when aggregating data for Run Environments as we don’t have a reliable way to group league history pitching data by subleague or of calculating outs.  On this table, outs are conveniently recorded in the ‘outs’ column.

CREATE TABLE `players_career_pitching_stats` (
  `pcps_id` int(11) NOT NULL AUTO_INCREMENT,    
  `player_id` int(11) NOT NULL,
  `year` smallint(6) NOT NULL,
  `team_id` int(11) NOT NULL,
  `game_id` int(11) DEFAULT NULL,
  `league_id` int(11) DEFAULT NULL,
  `level_id` smallint(6) DEFAULT NULL,
  `split_id` smallint(6) NOT NULL,
  `ip` smallint(6) DEFAULT NULL,
  `ab` smallint(6) DEFAULT NULL,
  `tb` smallint(6) DEFAULT NULL,
  `ha` smallint(6) DEFAULT NULL,
  `k` smallint(6) DEFAULT NULL,
  `bf` smallint(6) DEFAULT NULL,
  `rs` smallint(6) DEFAULT NULL,
  `bb` smallint(6) DEFAULT NULL,
  `r` smallint(6) DEFAULT NULL,
  `er` smallint(6) DEFAULT NULL,
  `gb` smallint(6) DEFAULT NULL,
  `fb` smallint(6) DEFAULT NULL,
  `pi` smallint(6) DEFAULT NULL,
  `ipf` smallint(6) DEFAULT NULL,
  `g` smallint(6) DEFAULT NULL,
  `gs` smallint(6) DEFAULT NULL,
  `w` smallint(6) DEFAULT NULL,
  `l` smallint(6) DEFAULT NULL,
  `s` smallint(6) DEFAULT NULL,
  `sa` smallint(6) DEFAULT NULL,
  `da` smallint(6) DEFAULT NULL,
  `sh` smallint(6) DEFAULT NULL,
  `sf` smallint(6) DEFAULT NULL,
  `ta` smallint(6) DEFAULT NULL,
  `hra` smallint(6) DEFAULT NULL,
  `bk` smallint(6) DEFAULT NULL,
  `ci` smallint(6) DEFAULT NULL,
  `iw` smallint(6) DEFAULT NULL,
  `wp` smallint(6) DEFAULT NULL,
  `hp` smallint(6) DEFAULT NULL,
  `gf` smallint(6) DEFAULT NULL,
  `dp` smallint(6) DEFAULT NULL,
  `qs` smallint(6) DEFAULT NULL,
  `svo` smallint(6) DEFAULT NULL,
  `bs` smallint(6) DEFAULT NULL,
  `ra` smallint(6) DEFAULT NULL,
  `cg` smallint(6) DEFAULT NULL,
  `sho` smallint(6) DEFAULT NULL,
  `sb` smallint(6) DEFAULT NULL,
  `cs` smallint(6) DEFAULT NULL,
  `hld` smallint(6) DEFAULT NULL,
  `ir` double DEFAULT NULL,
  `irs` double DEFAULT NULL,
  `wpa` double DEFAULT NULL,
  `li` double DEFAULT NULL,
  `stint` smallint(6) NOT NULL,
  `outs` smallint(6) DEFAULT NULL,
  `war` double DEFAULT NULL,
  PRIMARY KEY (`pcps_id`),
  INDEX `pcps_ix1` (`league_id`),
  INDEX `pcps_ix2` (year),
  INDEX `pcps_ix3` (`player_id`),
  INDEX `pcps_ix4` (`team_id`),
  INDEX `pcps_ix5` (`split_id`)  
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

Tables 5: The Players Table

Creating players table

This table contains all of the biographic and demographic detail of all players.  A couple notes here.

  • ‘team_id’ and ‘league_id’ refer to the team and league of a given player at the time of the last data dump.  We will be ignoring these columns from this table in all of our career and prior-year stats.  They will be useful when we’re looking for a player’s team mid-season.
  • At some point, I posted a request for a new feature.  The request was that although we can view a coach’s former player profile, we can’t see a player’s coach profile if that player had retired and became a coach.  That feature can be implemented by adding a column ‘coach_id’ to this table.
CREATE TABLE `players` (
  `player_id` int(11) NOT NULL,
  `team_id` int(11) DEFAULT NULL,
  `league_id` int(11) DEFAULT NULL,
  `first_name` varchar(50) DEFAULT NULL,
  `last_name` varchar(50) DEFAULT NULL,
  `nick_name` varchar(50) DEFAULT NULL,
  `age` smallint(6) DEFAULT NULL,
  `date_of_birth` date DEFAULT NULL,
  `city_of_birth_id` int(11) DEFAULT NULL,
  `nation_id` int(11) DEFAULT NULL,
  `second_nation_id` int(11) DEFAULT NULL,
  `weight` smallint(6) DEFAULT NULL,
  `height` smallint(6) DEFAULT NULL,
  `retired` tinyint(4) DEFAULT NULL,
  `free_agent` tinyint(4) DEFAULT NULL,
  `last_league_id` int(11) DEFAULT NULL,
  `last_team_id` int(11) DEFAULT NULL,
  `organization_id` int(11) DEFAULT NULL,
  `last_organization_id` int(11) DEFAULT NULL,
  `language_ids0` int(11) DEFAULT NULL,
  `language_ids1` int(11) DEFAULT NULL,
  `uniform_number` smallint(6) DEFAULT NULL,
  `experience` smallint(6) DEFAULT NULL,
  `person_type` smallint(6) DEFAULT NULL,
  `bats` smallint(6) DEFAULT NULL,
  `throws` smallint(6) DEFAULT NULL,
  `personality_greed` smallint(6) DEFAULT NULL,
  `personality_loyalty` smallint(6) DEFAULT NULL,
  `personality_play_for_winner` smallint(6) DEFAULT NULL,
  `personality_work_ethic` smallint(6) DEFAULT NULL,
  `personality_intelligence` smallint(6) DEFAULT NULL,
  `personality_leader` smallint(6) DEFAULT NULL,
  `historical_id` varchar(50) DEFAULT NULL,
  `historical_team_id` varchar(50) DEFAULT NULL,
  `best_contract_offer_id` int(11) DEFAULT NULL,
  `injury_is_injured` tinyint(4) DEFAULT NULL,
  `injury_dtd_injury` tinyint(4) DEFAULT NULL,
  `injury_career_ending` tinyint(4) DEFAULT NULL,
  `injury_dl_left` smallint(6) DEFAULT NULL,
  `injury_dl_playoff_round` smallint(6) DEFAULT NULL,
  `injury_left` smallint(6) DEFAULT NULL,
  `dtd_injury_effect` smallint(6) DEFAULT NULL,
  `injury_id` int(11) DEFAULT NULL,
  `prone_overall` smallint(6) DEFAULT NULL,
  `prone_leg` smallint(6) DEFAULT NULL,
  `prone_back` smallint(6) DEFAULT NULL,
  `prone_arm` smallint(6) DEFAULT NULL,
  `fatigue_pitches0` smallint(6) DEFAULT NULL,
  `fatigue_pitches1` smallint(6) DEFAULT NULL,
  `fatigue_pitches2` smallint(6) DEFAULT NULL,
  `fatigue_pitches3` smallint(6) DEFAULT NULL,
  `fatigue_pitches4` smallint(6) DEFAULT NULL,
  `fatigue_pitches5` smallint(6) DEFAULT NULL,
  `fatigue_points` smallint(6) DEFAULT NULL,
  `fatigue_played_today` tinyint(4) DEFAULT NULL,
  `running_ratings_speed` smallint(6) DEFAULT NULL,
  `running_ratings_stealing` smallint(6) DEFAULT NULL,
  `running_ratings_baserunning` smallint(6) DEFAULT NULL,
  `position` smallint(6) DEFAULT NULL,
  `role` smallint(6) DEFAULT NULL,
  `college` tinyint(4) DEFAULT NULL,
  `draft_year` smallint(6) DEFAULT NULL,
  `draft_round` smallint(6) DEFAULT NULL,
  `draft_supplemental` tinyint(4) DEFAULT NULL,
  `draft_pick` smallint(6) DEFAULT NULL,
  `draft_overall_pick` smallint(6) DEFAULT NULL,
  `draft_eligible` tinyint(4) DEFAULT NULL,
  `hidden` tinyint(4) DEFAULT NULL,
  `draft_league_id` int(11) DEFAULT NULL,
  `draft_team_id` int(11) DEFAULT NULL,
  `turned_coach` tinyint(4) DEFAULT NULL,
  `hall_of_fame` tinyint(4) DEFAULT NULL,
  `rust` smallint(6) DEFAULT NULL,
  `inducted` smallint(6) DEFAULT NULL,
  `strategy_override_team` tinyint(4) DEFAULT NULL,
  `strategy_stealing` int(11) DEFAULT NULL,
  `strategy_running` int(11) DEFAULT NULL,
  `strategy_bunt_for_hit` int(11) DEFAULT NULL,
  `strategy_sac_bunt` int(11) DEFAULT NULL,
  `strategy_hit_run` int(11) DEFAULT NULL,
  `strategy_hook_start` int(11) DEFAULT NULL,
  `strategy_hook_relief` int(11) DEFAULT NULL,
  `strategy_pitch_count` int(11) DEFAULT NULL,
  `strategy_only_allow_single_inning_saves` tinyint(4) DEFAULT NULL,
  `strategy_pitch_around` int(11) DEFAULT NULL,
  `strategy_sticky_lineup` tinyint(4) DEFAULT NULL,
  `strategy_sticky_position` tinyint(4) DEFAULT NULL,
  `strategy_no_pinch_if_rested` tinyint(4) DEFAULT NULL,
  `strategy_never_pinch_hit` tinyint(4) DEFAULT NULL,
  `strategy_defensive_sub` tinyint(4) DEFAULT NULL,
  `strategy_never_defensive_sub_for` tinyint(4) DEFAULT NULL,
  `local_pop` smallint(6) DEFAULT NULL,
  `national_pop` smallint(6) DEFAULT NULL,
  `draft_protected` tinyint(4) DEFAULT NULL,
  `morale` smallint(6) DEFAULT NULL,
  `morale_player_performance` smallint(6) DEFAULT NULL,
  `morale_team_performance` smallint(6) DEFAULT NULL,
  `morale_team_transactions` smallint(6) DEFAULT NULL,
  `expectation` smallint(6) DEFAULT NULL,
  `morale_player_role` smallint(6) DEFAULT NULL,
  PRIMARY KEY (`player_id`),
  INDEX `players_ix1` (`team_id`),
  INDEX `players_ix2` (`league_id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

Tables 4: Parks and Teams

Creating parks table

Parks is another one of those tables that has a ton of columns of which we need only a couple.  It’s a standalone table that doesn’t have any columns that could be construed as a foreign key (if we were using them).  We’re really only interested in ‘park_id’ and the park factor columns about halfway down:

  • ‘avg’
  • ‘avg_l’
  • ‘avg_r’
  • ‘d’
  • ‘t’
  • ‘hr’
  • ‘hr_l’
  • ‘hr_r’

And of those, at least for the time being, we’re only interested in ‘avg’.  This is the factor we’ll be using to adjust some of the advanced statistics by to account for the specific conditions of the park in which each player played his home games.

CREATE TABLE `parks` (
  `park_id` int(11) NOT NULL,
  `dimensions_x` smallint(6) DEFAULT NULL,
  `dimensions_y` smallint(6) DEFAULT NULL,
  `batter_left_x` smallint(6) DEFAULT NULL,
  `batter_left_y` smallint(6) DEFAULT NULL,
  `batter_right_x` smallint(6) DEFAULT NULL,
  `batter_right_y` smallint(6) DEFAULT NULL,
  `bases_x0` smallint(6) DEFAULT NULL,
  `bases_x1` smallint(6) DEFAULT NULL,
  `bases_x2` smallint(6) DEFAULT NULL,
  `bases_y0` smallint(6) DEFAULT NULL,
  `bases_y1` smallint(6) DEFAULT NULL,
  `bases_y2` smallint(6) DEFAULT NULL,
  `positions_x0` smallint(6) DEFAULT NULL,
  `positions_x1` smallint(6) DEFAULT NULL,
  `positions_x2` smallint(6) DEFAULT NULL,
  `positions_x3` smallint(6) DEFAULT NULL,
  `positions_x4` smallint(6) DEFAULT NULL,
  `positions_x5` smallint(6) DEFAULT NULL,
  `positions_x6` smallint(6) DEFAULT NULL,
  `positions_x7` smallint(6) DEFAULT NULL,
  `positions_x8` smallint(6) DEFAULT NULL,
  `positions_x9` smallint(6) DEFAULT NULL,
  `positions_y0` smallint(6) DEFAULT NULL,
  `positions_y1` smallint(6) DEFAULT NULL,
  `positions_y2` smallint(6) DEFAULT NULL,
  `positions_y3` smallint(6) DEFAULT NULL,
  `positions_y4` smallint(6) DEFAULT NULL,
  `positions_y5` smallint(6) DEFAULT NULL,
  `positions_y6` smallint(6) DEFAULT NULL,
  `positions_y7` smallint(6) DEFAULT NULL,
  `positions_y8` smallint(6) DEFAULT NULL,
  `positions_y9` smallint(6) DEFAULT NULL,
  `avg` double DEFAULT NULL,
  `avg_l` double DEFAULT NULL,
  `avg_r` double DEFAULT NULL,
  `d` double DEFAULT NULL,
  `t` double DEFAULT NULL,
  `hr` double DEFAULT NULL,
  `hr_r` double DEFAULT NULL,
  `hr_l` double DEFAULT NULL,
  `temperature0` smallint(6) DEFAULT NULL,
  `temperature1` smallint(6) DEFAULT NULL,
  `temperature2` smallint(6) DEFAULT NULL,
  `temperature3` smallint(6) DEFAULT NULL,
  `temperature4` smallint(6) DEFAULT NULL,
  `temperature5` smallint(6) DEFAULT NULL,
  `temperature6` smallint(6) DEFAULT NULL,
  `temperature7` smallint(6) DEFAULT NULL,
  `temperature8` smallint(6) DEFAULT NULL,
  `temperature9` smallint(6) DEFAULT NULL,
  `temperature10` smallint(6) DEFAULT NULL,
  `temperature11` smallint(6) DEFAULT NULL,
  `rain0` smallint(6) DEFAULT NULL,
  `rain1` smallint(6) DEFAULT NULL,
  `rain2` smallint(6) DEFAULT NULL,
  `rain3` smallint(6) DEFAULT NULL,
  `rain4` smallint(6) DEFAULT NULL,
  `rain5` smallint(6) DEFAULT NULL,
  `rain6` smallint(6) DEFAULT NULL,
  `rain7` smallint(6) DEFAULT NULL,
  `rain8` smallint(6) DEFAULT NULL,
  `rain9` smallint(6) DEFAULT NULL,
  `rain10` smallint(6) DEFAULT NULL,
  `rain11` smallint(6) DEFAULT NULL,
  `wind` smallint(6) DEFAULT NULL,
  `wind_direction` smallint(6) DEFAULT NULL,
  `distances0` smallint(6) DEFAULT NULL,
  `distances1` smallint(6) DEFAULT NULL,
  `distances2` smallint(6) DEFAULT NULL,
  `distances3` smallint(6) DEFAULT NULL,
  `distances4` smallint(6) DEFAULT NULL,
  `distances5` smallint(6) DEFAULT NULL,
  `distances6` smallint(6) DEFAULT NULL,
  `wall_heights0` smallint(6) DEFAULT NULL,
  `wall_heights1` smallint(6) DEFAULT NULL,
  `wall_heights2` smallint(6) DEFAULT NULL,
  `wall_heights3` smallint(6) DEFAULT NULL,
  `wall_heights4` smallint(6) DEFAULT NULL,
  `wall_heights5` smallint(6) DEFAULT NULL,
  `wall_heights6` smallint(6) DEFAULT NULL,
  `name` varchar(100) DEFAULT NULL,
  `picture` varchar(200) DEFAULT NULL,
  `picture_night` varchar(200) DEFAULT NULL,
  `nation_id` int(11) DEFAULT NULL,
  `capacity` int(11) DEFAULT NULL,
  `type` smallint(6) DEFAULT NULL,
  `foul_ground` smallint(6) DEFAULT NULL,
  `turf` tinyint(4) DEFAULT NULL,
  `gender` int(11) DEFAULT NULL,
  `relative_path_3d_model` varchar(200) DEFAULT NULL,
  `file_name_3d_model` varchar(200) DEFAULT NULL,
  `home_team_dugout_is_at_first_base` tinyint(4) DEFAULT NULL,
  PRIMARY KEY (`park_id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

 

Creating teams table

This table contains, obviously, information about all of the teams in our OOTP universe.  Again, there are a bunch of columns that are only useful in-game.  We’re interested in the teams’ names, nicknames, league, sub_league, and park_id.

CREATE TABLE `teams` (
  `team_id` int(11) NOT NULL,
  `name` varchar(50) DEFAULT NULL,
  `abbr` varchar(50) DEFAULT NULL,
  `nickname` varchar(50) DEFAULT NULL,
  `logo_file_name` varchar(200) DEFAULT NULL,
  `city_id` int(11) DEFAULT NULL,
  `park_id` int(11) DEFAULT NULL,
  `league_id` int(11) DEFAULT NULL,
  `sub_league_id` int(11) DEFAULT NULL,
  `division_id` int(11) DEFAULT NULL,
  `nation_id` int(11) DEFAULT NULL,
  `parent_team_id` int(11) DEFAULT NULL,
  `level` int(11) DEFAULT NULL,
  `prevent_any_moves` tinyint(4) DEFAULT NULL,
  `human_team` tinyint(4) DEFAULT NULL,
  `human_id` int(11) DEFAULT NULL,
  `gender` int(11) DEFAULT NULL,
  `background_color_id` varchar(8) DEFAULT NULL,
  `text_color_id` varchar(8) DEFAULT NULL,
  `ballcaps_main_color_id` varchar(8) DEFAULT NULL,
  `ballcaps_visor_color_id` varchar(8) DEFAULT NULL,
  `jersey_main_color_id` varchar(8) DEFAULT NULL,
  `jersey_away_color_id` varchar(8) DEFAULT NULL,
  `jersey_secondary_color_id` varchar(8) DEFAULT NULL,
  `jersey_pin_stripes_color_id` varchar(8) DEFAULT NULL,
  `allstar_team` tinyint(4) DEFAULT NULL,
  `historical_id` varchar(50) DEFAULT NULL,
  PRIMARY KEY (`team_id`),
  INDEX `teams_ix1` (`park_id`),
  INDEX `teams_ix2` (`league_id`),
  INDEX `teams_ix3` (`sub_league_id`),
  INDEX `teams_ix4` (`division_id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;