Batting Stats 2: Easy Calculated Stats

I’ll be adding to the scripts started yesterday working towards a complete script for all batting stats.  I’ll post the relevant bits of the code I’m working on in the main body and post the complete script as it stands beneath the fold.

I’ll be tackling the easy calculated stats in this post, starting with batting average.  Easy and not worth a discussion.

Next are K% and BB%.  These stats measure the percentage a player’s plate appearances end in a strikeout or walk, respectively.

I made an interesting discovery while coding these easy stats.  As PA is a calculated stat, I was not looking forward to repeating the definition of PA (AB+BB+IBB+HP+SF+SH). A minor annoyance, but for a reason that promises larger annoyances down the road.  I thought about how, in Wyers’s Run Value tables, we used variables that were defined and used in expressions in the same statement.  I figured it was worth a try.  I created a view with PA defined as @PA := AB+BB+IBB+HP+SF+SH AS PA and used @PA as the denominator in the expressions. I got an error telling me I couldn’t do that. Figures. But why? It worked for Wyers – so what was the difference? The only one I could think of was that he used his variables in creating a table and I used mine creating a view. So, I switched from a view to a table. It worked! Here are the relevant lines of code:

DROP TABLE IF EXISTS CalcBatting;
    CREATE TABLE IF NOT EXISTS CalcBatting AS

   SELECT b.year
    , b.league_id
    , b.player_id
    , b.stint #We can eventually move this down the list
    , b.split_id #We can eventually remove
    , b.team_id #We can eventually move this down the list
    , l.abbr as Lg
    , t.abbr as Team
    , b.g
    , b.ab
    , @PA := b.ab+b.bb+b.sh+b.sf+b.hp AS PA
    , round(b.h/b.ab,3) AS ba
    , round(b.k/@PA,3) as krate
    , round((b.bb+b.ibb)/@PA,3) as bbrate
   FROM 
      players_career_batting_stats b 
      INNER JOIN leagues l ON      b.league_id=l.league_id 
      INNER JOIN teams t ON b.team_id=t.team_id
   WHERE b.ab<>0 AND b.split_id=1
   ORDER BY b.player_id, b.year

 

On Base Percentage is next and, since I’ll be using that stat again in this statement, I decided to define it as a variable as well: @OBP := round((b.h + b.bb + b.hp)/@PA,3) AS obp. This worked nicely, giving me identical results to when done without a variable.

Next up is OBP+.  Not a very often used stat, and not present in OOTP, but I like it.  It presents a player’s OBP in relation to the rest of the league average which is set to 100.  So, a player with a 103 OBP+ is 3% higher than league average.  wOBA is always relative to what the current league OBP is, so for any nuanced understanding of a player’s wOBA you need to have an understanding of what the current league OBP is.  Don’t get me wrong, wOBA is one of the first offensive stats I look at, but for a quick glance at the rate a player gets on base vs the rest of the world, OBP+ is great.

The formula for OBP+ is: 100 * (On Base Percentage / League On Base Percentage).

Remember, we calculated League wOBA in the Intermediate Run Values Table and referenced it in the table RunValues2, so we can call that up pretty easily. We’ll have to add another join to our statement, and that will give us:

DROP TABLE IF EXISTS CalcBatting;
    CREATE TABLE IF NOT EXISTS CalcBatting AS

    SELECT b.year
    , b.league_id
    , b.player_id
    , b.stint #We can eventually move this down the list
    , b.split_id #We can eventually remove
    , b.team_id #We can eventually move this down the list
    , l.abbr as Lg
    , t.abbr as Team
    , b.g
    , b.ab
    , @PA := b.ab+b.bb+b.sh+b.sf+b.hp AS PA
    , round(b.h/b.ab,3) AS ba
    , round(b.k/@PA,3) as krate
    , round((b.bb)/@PA,3) as bbrate
    , @OBP := round((b.h + b.bb + b.hp)/@PA,3) AS obp
    , round(100*(@OBP/r.woba),0) as 'OBP+'
FROM 
      players_career_batting_stats b 
      INNER JOIN leagues l ON b.league_id=l.league_id 
      INNER JOIN teams t ON b.team_id=t.team_id
      INNER JOIN tblRunValues2 r ON b.year=r.year AND b.league_id=r.league_id
    WHERE b.ab<>0 AND b.split_id=1
    ORDER BY b.player_id, b.year

I’m going to stop here for now as I’ve hit a little snag.  My calculations for OBP aren’t matching up 100% against the ones that the game gives us.  Only off a couple points for full-season samples but much more with smaller ones.  I’d like to at least understand what’s happening, if not correct it, before moving on much further.  I posted about it here, and maybe the community can help me figure out the issue.  Spot checking a few players in both my database and the game, evaluating it in Excel has shown me that my database is calculating the formula correctly.  So, maybe the game is using a different formula?

Continue reading “Batting Stats 2: Easy Calculated Stats”

Batting Stats 1: The Easy Stuff

The goal here is to create a table (or view) for all offensive stats for each player-year-stint-split.  Let’s go back and parse that briefly:  Stats are collected for each player who accumulates them.  Each player gets his own row.  For each year that a player accumulates stats, a new row of data is created for that player.  For each team that a player plays in a given year (stint), a new row of data is created for that player.  Stats are accumulated and placed into three splits for each player-year-stint: Overall, vs. Left, and vs. Right.

I will most likely create views for each split, but for the moment I am going to focus only on Overall (split_id=1).

As this view is where I have gotten tripped up in the past, I’m going to take it slowly here, checking against game data periodically.  So, easy bit first: Let’s get the counting stats and the more traditional rate stats out of the way:

#Calculated batting stats for OOTP
    CREATE OR REPLACE VIEW CalcBatting AS

    SELECT b.year
    , b.league_id
    , b.player_id
    , b.stint #We can eventually move this down the list
    , b.split_id #We can eventually remove this
    , b.team_id #We can eventually move this down the list
    , l.abbr as Lg
    , t.abbr as Team
    , b.g
    , b.ab
    , b.ab+b.bb+b.sh+b.sf+b.hp as PA
    , b.r 
    , b.h
    , b.d
    , b.t
    , b.hr
    , b.rbi
    , b.sb
    , b.cs
    , b.bb
    , b.k
    , b.ibb
    , b.hp
    , b.sh
    , b.sf
    , b.gdp
        FROM 
      players_career_batting_stats b 
      INNER JOIN leagues l ON b.league_id=l.league_id 
      INNER JOIN teams t ON b.team_id=t.team_id
    WHERE b.ab<>0 AND b.split_id=1
    ORDER BY b.player_id, b.year

We’ll pick a random player from in-game and make sure we line up.  Let’s pick player #14 from 2015 and compare to OOTP.  Not expecting any problems on the easy stuff.  Still, just to be sure.

Output from calcbatting
Frank Garcia

So far so good.  Next post will finish up with the standard stats and some of the easier advanced stats.