This table contains all batting statistics for all players for all years. If data is dumped in the middle of a year, that year’s stats are also in this table. There are multiple records for every year based on a number of columns. I added an AUTO_INCREMENT id column to avoid a composite PK made up of the following:
- ‘player_id’
- ‘year’
- ‘split_id’ – as mentioned earlier, this breaks out a players stats overall, vs lefties, and vs righties
- ‘stint’ – a player gets a new record if he changes teams during the course of a season
We’re indexing all of these as well as ‘team_id’.
There are two columns that contain static records of calculated stats: ‘wpa’ and ‘war’.
Win Percentage Added is a stat that I don’t find much use for. Essentially it sums the percentage increase of a team’s chances of winning after an at-bat. I believe it is a real measure, but I feel that there are other measures that better capture offensive contribution without relying on situational hitting and specific game scenarios.
Wins Above Replacement is a counting stat that takes offense, base running, and defense into account and represents it as a number of wins that a player is responsible for above a replacement level player. It’s a complicated formula that relies on some defensive metrics that I don’t fully understand yet, so it’s nice that it’s here. Particularly because this is a counting stat, it can stand alone.
CREATE TABLE `players_career_batting_stats` (
`pcbs_id` int(11) NOT NULL AUTO_INCREMENT,
`player_id` int(11) NOT NULL,
`year` smallint(6) NOT NULL,
`team_id` int(11) NOT NULL,
`game_id` int(11) DEFAULT NULL,
`league_id` int(11) DEFAULT NULL,
`level_id` smallint(6) DEFAULT NULL,
`split_id` smallint(6) NOT NULL,
`position` smallint(6) DEFAULT NULL,
`ab` smallint(6) DEFAULT NULL,
`h` smallint(6) DEFAULT NULL,
`k` smallint(6) DEFAULT NULL,
`pa` smallint(6) DEFAULT NULL,
`pitches_seen` smallint(6) DEFAULT NULL,
`g` smallint(6) DEFAULT NULL,
`gs` smallint(6) DEFAULT NULL,
`d` smallint(6) DEFAULT NULL,
`t` smallint(6) DEFAULT NULL,
`hr` smallint(6) DEFAULT NULL,
`r` smallint(6) DEFAULT NULL,
`rbi` smallint(6) DEFAULT NULL,
`sb` smallint(6) DEFAULT NULL,
`cs` smallint(6) DEFAULT NULL,
`bb` smallint(6) DEFAULT NULL,
`ibb` smallint(6) DEFAULT NULL,
`gdp` smallint(6) DEFAULT NULL,
`sh` smallint(6) DEFAULT NULL,
`sf` smallint(6) DEFAULT NULL,
`hp` smallint(6) DEFAULT NULL,
`ci` smallint(6) DEFAULT NULL,
`wpa` double DEFAULT NULL,
`stint` smallint(6) NOT NULL,
`war` double DEFAULT NULL,
PRIMARY KEY (`pcbs_id`),
INDEX `pcbs_ix1` (`league_id`),
INDEX `pcbs_ix2` (year),
INDEX `pcbs_ix3` (`player_id`),
INDEX `pcbs_ix4` (`team_id`),
INDEX `pcbs_ix5` (`split_id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
[…] much to say about this table at the moment. I set up indexes on the same fields as I did for the Career Batting Stats […]