Batting Stats 9: wRAA

Weighted Runs Above Average (wRAA) takes Weighted Runs Created (wRC) and puts it in the context of comparing it to the average player.  Below, we see one of our test subjects, Jorge Acosta.

In 2015, Acosta accumulated 56.9 wRC.  Is that good?  As with all counting stats, it depends.  Like other counting stats, the number of plate appearances matters.  (I know I have AB shown here, but bear with me.) A player who racks up 50 RBIs over a full season isn’t terribly impressive.  One who does it in 200 plate appearances is amazing.  Same with wRC.  Fangraphs gives us some general guidelines on what’s good, great, average, etc. over the course of a full season.  Still, that’s a number that is devoid of any context.  How does that stack up against other players in the league?

That’s where wRAA comes in.  League average wRAA is always 0.  So, looking at Acosta with his -7.3 wRAA, we see that his 56.9 wRC is not impressive at all: 7% below league average.  In 2015, Acosta played like a scrub.

We derive wRAA thusly:
(wOBA-lg_woba)/wOBAscale)*PA

It does, and should, look very similar to wRC.

My testing paramaters are based on the idea that there’s generally around 10 points separating the middle categories of poor to above average.  In order that my stats stay close to the game’s, I started off with a threshold of 3 points.  All of my results were within that range, so I tightened up a bit.  I changed my “good” zone to 1.0.  Here are my results:

Again, I’m really happy with how this turned out.  I don’t think I need to do any tweaking or further investigation.  Ready to move on!

Our script with the addition of wRAA:

#Calculated batting stats for OOTP
    DROP TABLE IF EXISTS CalcBatting;
    CREATE TABLE IF NOT EXISTS CalcBatting AS

    SELECT b.year
    , b.league_id
    , b.player_id
    , b.stint #We can eventually move this down the list
    , b.split_id #We can eventually remove
    , b.team_id #We can eventually move this down the list
    , l.abbr as Lg
    , t.abbr as Team
    , b.g
    , b.ab
    , @PA := b.ab+b.bb+b.sh+b.sf+b.hp AS PA
    , b.r 
    , b.h
    , b.d
    , b.t
    , b.hr
    , b.rbi
    , b.sb
    , b.cs
    , b.bb
    , b.k
    , b.ibb
    , b.hp
    , b.sh
    , b.sf
    , b.gdp
    , b.ci
    , @BA := round(b.h/b.ab,3) AS ba
    , round(b.k/@PA,3) as krate
    , round((b.bb)/@PA,3) as bbrate
    , @OBP := round((b.h + b.bb + b.hp)/(@PA-b.sh-b.ci),3) AS obp
    , round(100*(@OBP/r.woba),0) as OBPplus
    , @SLG := round((b.h+b.d+2*b.t+3*b.hr)/b.ab,3) as slg
    , round(@OBP+@SLG,3) as ops
    , round(@SLG-@BA,3) as iso
    , round((b.h-b.hr)/(b.ab-b.k-b.hr+b.sf),3) as babip
    , @woba := round((r.wobaBB*(b.bb-b.ibb) + r.wobaHB*b.hp + r.woba1B*(b.h-b.d-b.t-b.hr) +
       r.woba2B*b.d + r.woba3B*b.t + r.wobaHR*b.hr)
       /(b.ab+b.bb-b.ibb+b.sf+b.hp),3) as woba
    , round(((@woba-r.woba)/r.wOBAscale)*@PA,1) as wRAA
    , round((((@woba-r.woba)/r.wOBAscale)+(lro.totr/lro.totpa))*@PA,1) as wRC
    /* NOT yet modified for OOTP and MySQL
    , ((([wRAA]/[PA] + l.RperPA) + (l.RperPA - t.bpf*l.RperPA))/(lb.wRC/lb.PA))*100 AS [wRC+]
    */
    FROM 
      players_career_batting_stats b 
      INNER JOIN leagues l ON b.league_id=l.league_id 
      INNER JOIN teams t ON b.team_id=t.team_id
      INNER JOIN tblRunValues2 r ON b.year=r.year AND b.league_id=r.league_id
      INNER JOIN vLeagueRunsPerOut lro ON b.year=lro.year AND b.league_id=lro.league_id
    WHERE b.ab<>0 AND b.split_id=1
    ORDER BY b.player_id, b.year

Batting Stats 8: wRC

Weighted Runs Created (wRC) is a counting stat that sums the weighted total offensive production of a player in terms of runs.  It tells you how many runs a player produced through his offense in a given time period.  It’s based on the same linear weights as wOBA.  In fact, it’s derived completely from wOBA.  Essentially, it takes the difference between the player’s wOBA and the league wOBA divided by the wOBAscale, plus the league runs per PA, and multiplies it by the player’s plate appearances:

round((((@woba-lg_woba)/wOBAscale)+(lg_.totr/lg_.totpa))*PA,1)

The results I got in my testing were pretty good.  This makes some sense as my wOBA results are also pretty good, and this stat is derived from that one.  Still, it’s a bit of a relief.  I decided that my testing threshold for this stat would be 3.5.  Fangraphs suggests that increments of 10 wRC separate the awful from the poor, and the average from the above average.  I figure that by keeping my threshold well below half of that, I can be reasonably confident that my stats will be in the right neighborhood as those the game generates.

I’m really pretty pleased with this.  Note that this particular sample is pretty representative, and doesn’t contain any players that are at the top of the league.  The fringy cases may still be problematic, but I am good to move on.

The script as it stands now:

#Calculated batting stats for OOTP
    DROP TABLE IF EXISTS CalcBatting;
    CREATE TABLE IF NOT EXISTS CalcBatting AS

    SELECT b.year
    , b.league_id
    , b.player_id
    , b.stint #We can eventually move this down the list
    , b.split_id #We can eventually remove
    , b.team_id #We can eventually move this down the list
    , l.abbr as Lg
    , t.abbr as Team
    , b.g
    , b.ab
    , @PA := b.ab+b.bb+b.sh+b.sf+b.hp AS PA
    , b.r 
    , b.h
    , b.d
    , b.t
    , b.hr
    , b.rbi
    , b.sb
    , b.cs
    , b.bb
    , b.k
    , b.ibb
    , b.hp
    , b.sh
    , b.sf
    , b.gdp
    , b.ci
    , @BA := round(b.h/b.ab,3) AS ba
    , round(b.k/@PA,3) as krate
    , round((b.bb)/@PA,3) as bbrate
    , @OBP := round((b.h + b.bb + b.hp)/(@PA-b.sh-b.ci),3) AS obp
    , round(100*(@OBP/r.woba),0) as OBPplus
    , @SLG := round((b.h+b.d+2*b.t+3*b.hr)/b.ab,3) as slg
    , round(@OBP+@SLG,3) as ops
    , round(@SLG-@BA,3) as iso
    , round((b.h-b.hr)/(b.ab-b.k-b.hr+b.sf),3) as babip
    , @woba := round((r.wobaBB*(b.bb-b.ibb) + r.wobaHB*b.hp + r.woba1B*(b.h-b.d-b.t-b.hr) +
       r.woba2B*b.d + r.woba3B*b.t + r.wobaHR*b.hr)
       /(b.ab+b.bb-b.ibb+b.sf+b.hp),3) as woba
    , round((((@woba-r.woba)/r.wOBAscale)+(lro.totr/lro.totpa))*@PA,1) as wRC
    /* NOT yet modified for OOTP and MySQL
    , ((([wRAA]/[PA] + l.RperPA) + (l.RperPA - t.bpf*l.RperPA))/(lb.wRC/lb.PA))*100 AS [wRC+]
    */
    FROM 
      players_career_batting_stats b 
      INNER JOIN leagues l ON b.league_id=l.league_id 
      INNER JOIN teams t ON b.team_id=t.team_id
      INNER JOIN tblRunValues2 r ON b.year=r.year AND b.league_id=r.league_id
      INNER JOIN vLeagueRunsPerOut lro ON b.year=lro.year AND b.league_id=lro.league_id
    WHERE b.ab<>0 AND b.split_id=1
    ORDER BY b.player_id, b.year

Batting Stats 7: wOBA Corrected

I think I found one of the culprits: Intentional Walks.  Our two guys, Sink and McDonough, have IBB’s in that season well over the average.  I queried the database to find their IBB’s as well as league average for those seasons:

SELECT b.player_id
  , concat(p.first_name, " ", p.last_name) as player 
  , l.abbr
  , b.year
  , b.league_id as league 
  , b.ibb
  , x.lg_avg_ibb
  , b.ibb-x.lg_avg_ibb as diff
FROM (  	
	SELECT b.year
	 , b.league_id
	 , round(avg(b.ibb),0) as lg_avg_ibb
	FROM CalcBatting b
	INNER JOIN players p ON b.player_id-p.player_id
	WHERE b.ab>200 AND p.position<>1 
	GROUP BY b.year, b.league_id
	) as x
  INNER JOIN CalcBatting b ON x.year=b.year AND x.league_id=b.league_id
  INNER JOIN players p ON b.player_id=p.player_id
  INNER JOIN leagues l ON b.league_id=l.league_id
 WHERE b.player_id IN (2574, 472) AND b.year=2015;

You’ll see that both were well above average:

Sink is about 1 standard deviation above average while McDonough is off the charts at almost 6.

I took a look at the 10 highest IBB variances to see if we can say with any certainty that we’re handling IBB’s incorrectly:

Looking at this, I think it’s pretty clear that if nothing else, IBB is at least contributing to the differences in wOBA by weighting it too highly.

It turns out that it’s not necessarily the weighting, but rather for formula for wOBA itself.  I removed IBB from the formula and recalculated wOBA for these 10 player-seasons:

Every single one of them improved.  Not yet perfect, but now that I have the outliers at least within the medium range, I feel comfortable moving on for real.  So, here’s the revised script for CalcBatting with the corrected formula for wOBA:

#Calculated batting stats for OOTP
    DROP TABLE IF EXISTS CalcBatting;
    CREATE TABLE IF NOT EXISTS CalcBatting AS

    SELECT b.year
    , b.league_id
    , b.player_id
    , b.stint #We can eventually move this down the list
    , b.split_id #We can eventually remove
    , b.team_id #We can eventually move this down the list
    , l.abbr as Lg
    , t.abbr as Team
    , b.g
    , b.ab
    , @PA := b.ab+b.bb+b.sh+b.sf+b.hp AS PA
    , b.r 
    , b.h
    , b.d
    , b.t
    , b.hr
    , b.rbi
    , b.sb
    , b.cs
    , b.bb
    , b.k
    , b.ibb
    , b.hp
    , b.sh
    , b.sf
    , b.gdp
    , b.ci
    , @BA := round(b.h/b.ab,3) AS ba
    , round(b.k/@PA,3) as krate
    , round((b.bb)/@PA,3) as bbrate
    , @OBP := round((b.h + b.bb + b.hp)/(@PA-b.sh-b.ci),3) AS obp
    , round(100*(@OBP/r.woba),0) as OBPplus
    , @SLG := round((b.h+b.d+2*b.t+3*b.hr)/b.ab,3) as slg
    , round(@OBP+@SLG,3) as ops
    , round(@SLG-@BA,3) as iso
    , round((b.h-b.hr)/(b.ab-b.k-b.hr+b.sf),3) as babip
    , round((r.wobaBB*(b.bb-b.ibb) + r.wobaHB*b.hp + r.woba1B*(b.h-b.d-b.t-b.hr) +
       r.woba2B*b.d + r.woba3B*b.t + r.wobaHR*b.hr)
       /(b.ab+b.bb-b.ibb+b.sf+b.hp),3) as woba
    /* NOT yet modified for OOTP and MySQL
    , round((([woba]-r.lgwoba)/r.wobascale)*[PA],1) as wRAA
    , round(((([woba]-r.lgwoba)/r.wobascale)+(l.totr/l.totpa))*[PA],0) as wRC
    , ((([wRAA]/[PA] + l.RperPA) + (l.RperPA - t.bpf*l.RperPA))/(lb.wRC/lb.PA))*100 AS [wRC+]
    */
    FROM 
      players_career_batting_stats b 
      INNER JOIN leagues l ON b.league_id=l.league_id 
      INNER JOIN teams t ON b.team_id=t.team_id
      INNER JOIN tblRunValues2 r ON b.year=r.year AND b.league_id=r.league_id
    WHERE b.ab<>0 AND b.split_id=1
    ORDER BY b.player_id, b.year

Batting Stats 6: Follow-up on wOBA

I wasn’t totally comfortable with the testing results for wOBA.  I didn’t change the calculation, but I did change my testing methodology.  The first time around was totally un-scientific:  I typed random player_id’s into a query, looked for stats that felt ‘good’ (whatever that means), grabbed their names and searched for the stats in the game.

I revisited my method several times.  Since my biggest misses in the last test were with small sample sizes and, since I don’t really care about stats with small sample sizes, I limited my sample to player-years with at least 200 AB’s.  Perhaps redundantly, I removed pitchers from the sample as well.  Thinking that there would be an easy way to grab the stats from OOTP, I limited the sample to the last two game years. (There’s not, really.)  Finally, rather than cherry picking records that passed some indefinable sniff-test, I assigned a random number to each record, sorted my results by that random number, and limited the result set to 30.  Then, I went into the game and looked up the stats for those player-years.

Here’s the query that did that:

SELECT rand() as sorted
  , p.player_id
  , concat(first_name, " ", last_name) as player
  , b.year
  , l.abbr 
  , b.ab
  , b.woba
FROM players p
  INNER JOIN CalcBatting b ON p.player_id=b.player_id
  INNER JOIN leagues l ON b.league_id=l.league_id
WHERE p.position<>1 AND b.year>=2014 AND b.ab>200
ORDER BY sorted
LIMIT 30;

I am much more comfortable with this result set:

77% were within range and 3 of the 5 that were medium were within 3 points of being good.  The two bad ones were really bad, though.  I will take a look at those particular seasons to see if I can find any stats that are out of normal range.  My thought is that somehow I am weighting a less-common stat (such as hit by pitch) incorrectly.  For a normal season, if HBP is weighted wrong but only occurs 5 times in 600 plate appearances, no big deal.  But if a guy got plunked 50 times or something, it could really throw off the calculation.  That’s my working theory going in at least.  If I find something weird, I will report back.  Otherwise, I’m going to move on.

Batting Stats 5: wOBA

Here it is- wOBA.  I’ve been building up to this for a while and I’m excited to get going.  For the full details on wOBA, check out Fangraphs‘ page.  The short version is that wOBA takes On Base Percentage and weights each of the components based on the run expectancy for each of the component events in each league year.  It’s a very good measure of the offensive contribution of a player in a specific league setting.  It does not normalize the data across years and ballparks – we will get to some stats that do that later on.  But it does acknowledge that a double is worth more than a walk – and it quantifies how much more.

The formula takes the woba weights that we calculated in Run Table 2 and multiplies them by the number of walks, singles, doubles, etc. that a player accumulated during a season:

, round((r.wobaBB*b.bb + r.wobaHB*b.hp + r.woba1B*(b.h-b.d-b.t-b.hr) +
       r.woba2B*b.d + r.woba3B*b.t + r.wobaHR*b.hr)
       /(b.ab+b.bb-b.ibb+b.sf+b.hp),3) as woba

I’ve already plugged this formula into the table and I’m getting results.  Before I start comparing my results to the game and pulling my hair out, I am going to establish success criteria.  I mentioned in the Run Environment posts that our formulae contain constants that I don’t fully understand.  Moreover, I don’t know if the game uses the same constants.  So, to expect a perfect match between my database and the game may be unreasonable.

Let’s think about what a reasonable margin of difference would be.  Any difference less than 5 points is negligible.  I can’t really make a distinction between two players with wOBAs of .320 and .325.  They’re essentially equal.  Can I say the same thing about 20 points?  No.  There’s clearly a difference between .320 and .340.  What about 10?  Iffy.  If I want to give myself as much wiggle room as possible without losing faith in my metrics, I can’t go higher than 10.  Let’s set 10 as the absolute limit and see what we get.   I am going to take a sample of 35 player years and see how close I get:

So, that’s 21 where the difference is less than 10 points.  7 where the difference is between 10 and 20 points, and 7 where the difference is greater than 20.  Not perfect – 40% outside of the range I set for myself.  However, still better than I feared, and actually better than it looks.  For example, Dan Wasielewski’s 2012 season is off by 20 points.  But that 2012 season only had 6 plate appearances.  I am totally OK having wacky numbers for very small sample sizes.  The greatest difference (.029), similarly came from a season with barely over 100 plate appearances.

Without digging in to how to determine my own constants and without being able to pick Markus’s or Matt’s brains on how they derive these formulae, I think I’m pretty OK with my stats.  Until I decide to relentlessly pursue perfection, that is…

Batting Stats 4: BABIP

We’ll look at one last easy stat before the hard stuff.

BABIP is a very interesting stat because it acts as a kind of sanity check on all of the other stats a player produces.  It shows Batting Average on Balls in Play and average in most professional leagues is around .300.  A player with a career average well above that generally hits the ball hard.  A player with a career BABIP well below that mark often makes weak contact.  That’s an OK measure in itself, but it serves a much better purpose. If, during the course of a season, a player is putting up exceptional numbers, checking his BABIP against his career average will give you a sense of whether he’s getting lucky and likely to regress or if he’s turned a corner and upped his game.

The formula for BABIP is pretty straightforward:
(H - HR) / (AB - K - HR + SF)

The whole Megillah up to this point, then, is:

#Calculated batting stats for OOTP
    DROP TABLE IF EXISTS CalcBatting;
    CREATE TABLE IF NOT EXISTS CalcBatting AS

    SELECT b.year
    , b.league_id
    , b.player_id
    , b.stint #We can eventually move this down the list
    , b.split_id #We can eventually remove
    , b.team_id #We can eventually move this down the list
    , l.abbr as Lg
    , t.abbr as Team
    , b.g
    , b.ab
    , @PA := b.ab+b.bb+b.sh+b.sf+b.hp AS PA
    , b.r 
    , b.h
    , b.d
    , b.t
    , b.hr
    , b.rbi
    , b.sb
    , b.cs
    , b.bb
    , b.k
    , b.ibb
    , b.hp
    , b.sh
    , b.sf
    , b.gdp
    , b.ci
    , @BA := round(b.h/b.ab,3) AS ba
    , round(b.k/@PA,3) as krate
    , round((b.bb)/@PA,3) as bbrate
    , @OBP := round((b.h + b.bb + b.hp)/(@PA-b.sh-b.ci),3) AS obp
    , round(100*(@OBP/r.woba),0) as OBPplus
    , @SLG := round((b.h+b.d+2*b.t+3*b.hr)/b.ab,3) as slg
    , round(@OBP+@SLG,3) as ops
    , round(@SLG-@BA,3) as iso
    , round((b.h-b.hr)/(b.ab-b.k-b.hr+b.sf),3) as babip
    /* NOT yet modified for OOTP and MySQL
    , round((r.wobaBB*nz(b.bb,0) + r.wobahb*nz(b.hbp,0) + r.woba1b*(b.h-b.[2b]-b.[3b]-b.hr) + r.woba2b*b.[2b] + r.woba3b*b.[3b] + r.wobahr*b.hr)/(b.ab+nz(b.bb,0)-nz(b.ibb,0)+nz(b.sf,0)+nz(b.hbp,0)),3) as woba
    , round((([woba]-r.lgwoba)/r.wobascale)*[PA],1) as wRAA
    , round(((([woba]-r.lgwoba)/r.wobascale)+(l.totr/l.totpa))*[PA],0) as wRC
    , ((([wRAA]/[PA] + l.RperPA) + (l.RperPA - t.bpf*l.RperPA))/(lb.wRC/lb.PA))*100 AS [wRC+]
    */
    FROM 
      players_career_batting_stats b 
      INNER JOIN leagues l ON b.league_id=l.league_id 
      INNER JOIN teams t ON b.team_id=t.team_id
      INNER JOIN tblRunValues2 r ON b.year=r.year AND b.league_id=r.league_id
    WHERE b.ab<>0 AND b.split_id=1
    ORDER BY b.player_id, b.year

 

Batting Stats 3: SLG, OPS, and ISO

A short post for three pretty straightforward rate stats: Slugging Percentage, On Base Plus Slugging, and Isolated Power.

All three are measures of a batter’s hitting for power.  Keeping with my tradition here of not explaining things that I understand, I will get straight to the point.  Seeking to keep things simple and readable, and wanting to show off my mad variable skills, I set them all as variables, including batting average.  The result is:

DROP TABLE IF EXISTS CalcBatting;
CREATE TABLE IF NOT EXISTS CalcBatting AS

   SELECT b.year
    , b.league_id
    , b.player_id
    , b.stint #We can eventually move this down the list
    , b.split_id #We can eventually remove
    , b.team_id #We can eventually move this down the list
    , l.abbr as Lg
    , t.abbr as Team
    , b.g
    , b.ab
    , @PA := b.ab+b.bb+b.sh+b.sf+b.hp AS PA
    , @BA := round(b.h/b.ab,3) AS ba
    , round(b.k/@PA,3) as krate
    , round((b.bb)/@PA,3) as bbrate
    , @OBP := round((b.h + b.bb + b.hp)/(@PA-b.sh-b.ci),3) AS obp
    , round(100*(@OBP/r.woba),0) as OBPplus
    , @SLG := round((b.h+b.d+2*b.t+3*b.hr)/b.ab,3) as slg
    , round(@OBP+@SLG,3) as ops
    , round(@SLG-@BA,3) as iso
FROM 
      players_career_batting_stats b 
      INNER JOIN leagues l ON b.league_id=l.league_id 
      INNER JOIN teams t ON b.team_id=t.team_id
      INNER JOIN tblRunValues2 r ON b.year=r.year AND b.league_id=r.league_id
    WHERE b.ab<>0 AND b.split_id=1
    ORDER BY b.player_id, b.year

Continue reading “Batting Stats 3: SLG, OPS, and ISO”

The Tools I’m Using

Just some thoughts about the tools and services I am using for my project.

Up until the middle of the batting stats table, I had been running MySQL locally on my late 2012 MacBook Pro 15inch Retina, maxed out with the 2.6 i7 and 16GB of RAM.  It’s still a great machine.

For a while, I had been using MySQL Workbench as my interface.  It’s not bad at all and, for a free tool, it’s great.  It does have some shortcomings in terms of ease of use and importing data.  For example, when using it to connect to a remote database, it insists on treating it like a localhost.  Honestly, after a few hours of live chats with tech support trying to make it work, I gave up on it.

I switched to Querious and I couldn’t be happier.  Not cheap, but it’s a joy to work with.  My favorite feature?  That would have to be the ability to easily import from csv.  Being able to do it at all is pretty fantastic, but being able to clean and manipulate the data easily before import is amazing.  On top of that, the import speeds are blazing fast.

Querious has 4 main views: Content, Structure, Query, and Server.

Content allows you to view the content of tables and views just by clicking on an object in the sidebar.  From within that view, you can filter the results shown by clicking on the “show all’ button.  This pulls down a list of all the columns in the table and allows you to write a quick WHERE clause to filter the data.  A real time saver when you need to do a quick sanity check.

Clicking on the database itself rather than a table gives you a view of every table and view along with data size, index size, auto_increment count, and row count.  My only quibble with this is that the row count does not update automatically.  You have to refresh for each table to see a value.

Structure shows you the structure of the database and the tables.  Clicking on the database will show you a diagram of all the tables, with lines showing foreign keys and relationships between tables.  Clicking on a  table will show you all of the columns, their types, any constraints.  You can switch tabs to see primary keys, indexes, and foreign keys.  You can also add, change, or delete these from this view.  Other tabs show you what triggers exist for tables and allow you to create or change them.  Finally, you can see table info and options – along with an exportable CREATE statement.

The Query view gives you a SQL editor with all the usual bells and whistles: context sensitive highlighting, an intelligent (much more so than Workbench) auto complete feature.  There’s also a context sensitive table reference panel that senses which table you’re referring to and lists all of the columns.  Double clicking on a column will insert the name at the cursor.  You can also switch to a different table if you’re thinking quicker than the AI.

You can also open or create SQL files in a separate window that has all of the same features except for the table reference panel.    The look and feel of both the editors are great, but for reasons mentioned below, I am still doing most of my editing in a different program.

My only complaint at all about Querious is that it seems to be frightened of large sql files.  My 10MB Test Data Load script has never failed to hang the program (beachball) necessitating a force-quit.  Even loading a 1000 line script hangs the program for at least a minute.  That’s loading, not executing.  The way around the test data load is to import the data via csv.  It handles that like a pro.

Still, as mentioned above, I do most of my editing in a separate program.  I use Brackets because the editing environment is almost as good as Querious and the file management is far better.  Perhaps because it was made with web design in mind, it makes it much easier to manage a collection of files, keeping them straight, organized, and easily accessible.  Querious allows you to save sql files and even allows you to put them in folders and favorites, but I can’t seem to get comfortable with the function.  With Brackets, I know where my files are and can flip back and forth with ease.

My remote database is hosted by DreamHost.  I have a number of domains hosted with them and really like them.  Their pricing is fair and have a number of goodies that make setting up portfolios and blogs, as well as provisioning databases, simple.  Their customer service is great.  If you’re interested in setting up hosting with them, ask me for a referral!