
OBP

On-Base Percentage (OBP) measures the most important thing a batter can do at the plate: not make an out. Since a team only gets 27 outs per game, making outs at a high rate is a bad thing, at least if a team wants to win. Players with high on-base percentages avoid making outs and reach base at a high rate, prolonging games and giving their team more opportunities to score. The formula for OBP is simple:

OBP = (H + BB + HBP) / (AB + BB + HBP + SF)
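As a quick illustration of that formula, here is a minimal Python sketch; the function name and the sample counting line are purely illustrative.

```python
def obp(h, bb, hbp, ab, sf):
    """On-Base Percentage: times on base divided by the plate appearances
    that count toward OBP (at-bats + walks + hit-by-pitches + sac flies)."""
    return (h + bb + hbp) / (ab + bb + hbp + sf)

# Hypothetical season line: 160 hits, 70 walks, 5 HBP in 520 AB with 4 sac flies
print(round(obp(160, 70, 5, 520, 4), 3))  # 0.392
```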

OBP has become synonymous with the book Moneyball because in the early 2000s, teams weren't properly valuing players with high OBPs, and the Oakland A's could swipe talented players for cheap. These days, every team has come to accept how vitally important OBP is to its success, and that particular market inefficiency has closed. Context: Please note that the following chart is meant as an estimate, and that league-average OBP varies on a year-by-year basis. To see the league-average OBP for every year from 1901 to the present, check the FanGraphs leaderboards.

Rules of Thumb
Rating          OBP
Excellent       0.400
Great           0.370
Above Average   0.340
Average         0.320
Below Average   0.310
Poor            0.300
Awful           0.290

Things to Remember: OBP is considered more accurate than batting average in measuring a player's offensive value, since it takes into account both hits and walks. A player could bat over .300, but if they don't walk at all, they're not helping their team as much as a .270 hitter with a .380 OBP. A player's OBP becomes a good predictor of their future OBP after about 500 plate appearances. So if Pujols has a .500 OBP after only 50 plate appearances, don't expect him to continue reaching base at that rate.

OPS and OPS+


On-base Plus Slugging (OPS) is exactly what it sounds like: the sum of a player's on-base percentage and their slugging percentage. Many saberists don't like OPS because it treats OBP as equal in value to SLG%, while OBP has been shown to be roughly twice as important as SLG% in scoring runs (1.8 times, to be exact). However, OPS has value as a metric because it is the only widely accepted statistic that accounts for all the different aspects of offense: contact, patience, and power. You can find OPS on baseball cards and in broadcasts, and it's a simple statistic for regular baseball fans to understand. On-base Plus Slugging Plus (OPS+) has not gained as much widespread acceptance, but it is a more informative metric than OPS. This statistic normalizes a player's OPS: it adjusts for variables that might affect OPS scores (e.g. park effects) and puts the statistic on an easy-to-understand scale. A 100 OPS+ is league average, and each point up or down is one percentage point above or below league average. In other words, if a player had a 90 OPS+ last season, their OPS was 10% below league average. Since OPS+ adjusts for league and park effects, it's possible to use OPS+ to compare players from different years and on different teams. Context: Please note that the following chart is meant as an estimate, and that league-average OPS varies on a year-by-year basis. To see the league-average OPS for every year from 1901 to the present, check the FanGraphs leaderboards.

League-average OPS+ is always 100.

Rating          OPS
Excellent       1.000
Great           0.900
Above Average   0.800
Average         0.730
Below Average   0.700
Poor            0.600
Awful           0.550

Things to Remember: If you're looking to evaluate a player's offense, OPS is a better metric to use than batting average, but it should always be used in conjunction with other statistics as well. It's a good gateway statistic to get people thinking beyond the traditional statistics. If you have the choice, use Weighted On-Base Average (wOBA) instead of OPS. OPS weighs both OBP and SLG% the same, while wOBA accounts for the fact that OBP is actually more valuable. Since it provides context and adjusts for park and league effects, OPS+ is better to use than straight OPS, especially if you're comparing statistics between seasons.

A Visual Look at wOBA


by Steve Slowinski - April 18, 2011

If you're any sort of saberist, you should already know that Weighted On-Base Average (wOBA) is vastly superior to On-Base Plus Slugging (OPS) at measuring offensive value. While OPS is a mishmash statistic, throwing together OBP and SLG for kicks and giggles, wOBA was created based on research on the historical run values of events. It weighs all the different aspects of hitting in proportion to their actual, real-life value to a team's offense. But how exactly do these two statistics differ in assigning value to events? See for yourself:

What you see in that chart is a representation of how much wOBA and OPS weigh each individual outcome. The wOBA coefficients are easy to find and straightforward, but I had to take some shortcuts to come up with coefficient values for OPS. While straightforward in theory (the sum of OBP and SLG), OPS is actually a rather convoluted statistic. You want to try adding those two stats together?

Instead of tangling with all that, I took the shortcut of just assuming both statistics had the same denominator and calculated the coefficients that way. It's good enough for an estimate, and it gets the point across in the visual. So this is another area in which wOBA trumps OPS: simplicity. As you can see from the visual, wOBA puts more stress on walks, hit by pitches, and singles, while OPS attaches a huge value to home runs and triples.* Since OPS is calculated by adding OBP and SLG, many people believe it treats power and on-base skills as equally important, but that's simply not true. When you dig down into the actual values OPS attaches to each outcome, it still favors power hitters by a wide margin.

*OPS also ignores Reached On Errors (ROE), but these happen so infrequently that it isn't a huge concern. Also, when you look at OPS like this, doesn't it seem slightly ridiculous? How can we treat it as a serious statistic when its coefficients look like they were created by a third grader? I'll stick with the one backed by research and history, thank you very much.

wOBA
Weighted On-Base Average (wOBA) is one of the most important and popular catch-all offensive statistics. It was created by Tom Tango (and notably used in The Book) to measure a hitter's overall offensive value, based on the relative values of each distinct offensive event. wOBA is based on a simple concept: not all hits are created equal. Batting average assumes that they are. On-base percentage does too, but does one better by including other ways of reaching base. Slugging percentage weights hits, but not accurately (is a double worth twice as much as a single? In short, no). On-base plus slugging (OPS) does attempt to combine the different aspects of hitting into one metric, but it assumes that one percentage point of SLG is the same as one point of OBP. In reality, a handy estimate is that OBP is around twice as valuable as SLG (the exact ratio is about 1.8). Weighted On-Base Average combines all the different aspects of hitting into one metric, weighting each of them in proportion to its actual run value. While batting average, on-base percentage, and slugging percentage fall short in accuracy and scope, wOBA measures and captures offensive value more accurately and comprehensively. The wOBA formula for the 2012 season was:

wOBA = (0.691*uBB + 0.722*HBP + 0.884*1B + 1.257*2B + 1.593*3B + 2.058*HR) / (AB + BB - IBB + SF + HBP)

These weights change on a yearly basis, so you can find the specific wOBA weights for every year from 1871 to 2010 here. Context: Please note that the following chart is meant as an estimate, and that league-average wOBA varies on a year-by-year basis. It is set to the same scale as OBP, so league-average wOBA in a given year should be very close to the league-average OBP. To see the league-average wOBA for every year from 1901 to the present, check the FanGraphs leaderboards.
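To make that formula concrete, here is a minimal Python sketch using the 2012 weights quoted above; the season line passed in is hypothetical, and the weights would need to be swapped out for other seasons.

```python
def woba_2012(ubb, hbp, singles, doubles, triples, hr, ab, bb, ibb, sf):
    """wOBA with the 2012 linear weights quoted above; weights change yearly."""
    numerator = (0.691 * ubb + 0.722 * hbp + 0.884 * singles
                 + 1.257 * doubles + 1.593 * triples + 2.058 * hr)
    denominator = ab + bb - ibb + sf + hbp
    return numerator / denominator

# Hypothetical season line: 45 unintentional walks, 5 HBP, 100 singles,
# 30 doubles, 3 triples, 25 HR in 550 AB, 50 total walks (5 intentional), 4 SF
print(round(woba_2012(45, 5, 100, 30, 3, 25, 550, 50, 5, 4), 3))  # 0.359
```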

Rules of Thumb
Rating          wOBA
Excellent       0.400
Great           0.370
Above Average   0.340
Average         0.320
Below Average   0.310
Poor            0.300
Awful           0.290

Things to Remember: This stat accounts for the following aspects of hitting: unintentional walks, hit-by-pitches, singles, doubles, triples, and home runs. Stolen base and caught stealing numbers are also sometimes included. One reason to leave SB and CS out of the equation is if you are using wOBA to determine an ideal batting lineup. Exactly how much to weigh each of the components of wOBA was determined using linear weights. wOBA can easily be converted into offensive runs above average, called Weighted Runs Above Average (wRAA). The formula to convert wOBA into wRAA is listed below:

wRAA = ((wOBA - league wOBA) / wOBA scale) * PA

(League-average wOBA can be found here; wOBA scale values can be found here.) This stat is context-neutral, meaning it does not take into account whether there were runners on base for a player's hit or whether it was a close game at the time. wOBA on FanGraphs is not adjusted for park effects, meaning that batters who play in hitter-friendly parks will have slightly inflated wOBAs.

wRAA
Weighted Runs Above Average (wRAA) measures the number of offensive runs a player contributes to their team compared to the average player. How much offensive value did Evan Longoria contribute to his team in 2009? With wRAA, we can answer that question: 28.3 runs above average. A wRAA of zero is league-average, so a positive wRAA value denotes above-average performance and a negative wRAA denotes below-average performance. This is also a counting statistic (like RBIs), so players accrue more (or fewer) runs as they play. Calculating wRAA is simple if you have a player's wOBA value: subtract the league-average wOBA from your player's wOBA, divide by the wOBA scale coefficient (1.26 for 2011), and multiply that result by how many plate appearances the player received.

wRAA = ((wOBA - league wOBA) / wOBA scale) * PA

You can find wOBA scale values for any year from 1871 to 2010 here, and league-average wOBA for every year can be found on the FanGraphs leaderboards. The exact wOBA scale value varies on a year-to-year basis in order to set wOBA on the same scale as league-average OBP. Also, if you're feeling ambitious, it's possible to calculate wRAA using linear weights. Context: Please note that the following chart is meant as an estimate. No matter the year, this statistic will always have 0 wRAA as league-average.

Rating          wRAA
Excellent       40
Great           20
Above Average   10
Average         0
Below Average   -5
Poor            -10
Awful           -20

Things to Remember: wRAA is league adjusted, meaning you can use it to compare players from different leagues and years.

When calculating Wins Above Replacement (WAR), wRAA is used to represent offensive ability. Ten wRAA is equal to +1 win.
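Here is a minimal Python sketch of the conversion described above. The 1.26 wOBA scale is the 2011 value cited in the text; the player wOBA, league wOBA, and plate-appearance total are invented for illustration.

```python
def wraa(player_woba, league_woba, woba_scale, pa):
    """Weighted Runs Above Average: runs above (or below) a league-average
    hitter over the given number of plate appearances."""
    return (player_woba - league_woba) / woba_scale * pa

# Illustrative: a .380 wOBA hitter over 600 PA, assuming league wOBA of .320
# and the 1.26 wOBA scale mentioned above for 2011
print(round(wraa(0.380, 0.320, 1.26, 600), 1))  # 28.6 runs above average
```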

BABIP

Batting Average on Balls In Play (BABIP) measures how many of a batter's balls in play go for hits. While typically around 30% of all balls in play fall for hits, there are three main variables that can affect BABIP rates for individual players:

a) Defense - Say a player cracks a hard line drive down the third base line. If an elite fielder is playing at third, they may make a play on it and throw the runner out. However, if there's a dud over there with limited range, the ball could just as easily fly by for a hit. Players have no control over the defenses they're facing, and they can only direct their hits to a limited extent. Sometimes a batter can be making good contact, but is simply hitting balls right at fielders. Also, a batter who consistently hits into a shift may have a lower BABIP than a typical player.

b) Luck - Sometimes, even against a great defense, bloop hits can fall in. A batter may turn a nasty pitch into a dribbler that just sneaks past the first baseman, or they may blast a shot into the gap that a fielder makes a diving catch on. Hits can fall in despite the best pitches and the best defenses; that's just the game.

c) Changes in Talent Level - Over the course of a season, players can go through periods of adjustment. Maybe pitchers adjust to a weakness that a batter has, and the batter starts making less solid contact and getting fewer hits. Maybe a batter is simply on fire for a season, playing at a very high talent level and roping hard line drives all over the field. The harder a ball is hit, the more likely it is to fall in for a hit.

Because of this flakiness, BABIP can dramatically affect a hitter's batting average. If a large number of a batter's balls in play go for hits, that can boost their batting average quite high. Similarly, if a large number of balls in play get caught, it can reduce a player's total offensive value. If a player has a very high or very low BABIP, whatever the reason for the spike (defense, luck, or a slight change in skill), that player is likely to regress back toward their career BABIP rate. BABIP rates are flaky and prone to vary wildly from year to year, so we should always take any extreme BABIP rate with a grain of salt. Context: The average BABIP for hitters is around .290 to .310. If you see any player that deviates from this average to an extreme, they're likely due for regression. However, hitters can influence their BABIPs to some extent. For example, speedy hitters typically have high career BABIP rates (like Ichiro and his .357 career BABIP), so don't expect all players to regress to league average; instead, look at a player's career BABIP

rate. And if you want something more exact, try The Hardball Times xBABIP (expected BABIP) calculator, available at the top of the page. Things to Remember: Saying a player will regress is a tricky statistical subject that confuses many people. See our section on regression for more info. Line drives go for hits more often than groundballs, and groundballs go for hits more often than flyballs.

Batting average on balls in play


From Wikipedia, the free encyclopedia

In baseball statistics, Batting Average on Balls In Play (abbreviated BABIP) measures how many of a batter's balls in play go for hits, or how many balls in play against a pitcher go for hits, excluding home runs.[1] BABIP is commonly used as a red flag in sabermetric analysis, as a consistently high or low BABIP is hard to maintain - much more so for pitchers than hitters. Therefore, BABIP can be used to spot fluky seasons by pitchers: those whose BABIPs are extremely high can often be expected to improve in the following season, and those whose BABIPs are extremely low can often be expected to regress in the following season. A normal BABIP is around .300, though the baseline varies depending on a number of factors, including the quality of the team's defense (e.g. a team with an exceptionally bad defense might yield a BABIP as high as .315) and the pitching tendencies of the pitcher (for instance, whether he is a groundball or flyball pitcher).[2][3] While a pitcher's BABIP may go up and down in an individual season, there are distinct differences between pitchers' career averages. The equation for BABIP is:

BABIP = (H - HR) / (AB - K - HR + SF)

where H is hits, HR is home runs, AB is at bats, K is strikeouts, and SF is sacrifice flies.
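A minimal Python sketch of that equation, with a hypothetical stat line, might look like this:

```python
def babip(h, hr, ab, k, sf):
    """Batting Average on Balls In Play: hits minus home runs, divided by
    balls put in play (at-bats minus strikeouts minus homers, plus sac flies)."""
    return (h - hr) / (ab - k - hr + sf)

# Hypothetical line: 170 hits, 20 HR, 110 K, 5 SF in 580 AB
print(round(babip(170, 20, 580, 110, 5), 3))  # 0.330
```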



ISO
Isolated Power (ISO) is a measure of a hitter's raw power; or, to look at it another way, it measures how good a player is at hitting for extra bases. The simplest way to calculate ISO is to subtract a player's batting average from their slugging percentage, which leaves us with a measure of just a player's extra bases per at-bat. If you prefer, you can also calculate ISO this way:

ISO = ((2B) + (2*3B) + (3*HR)) / AB = Extra Bases / At-Bats

It takes a long time for a player's ISO to have predictive power going forward; a sample size of 550 plate appearances is recommended before drawing any conclusions. In other words, if Albert Pujols has a .550 ISO two weeks into the season, it's way too early to expect that to continue. Context: Please note that the following chart is meant as an estimate, and that league-average ISO varies on a year-by-year basis. To see the league-average ISO for every year from 1901 to the present, check the FanGraphs leaderboards.
Rating          ISO
Excellent       0.250
Great           0.200
Above Average   0.180
Average         0.145
Below Average   0.120
Poor            0.100
Awful           0.080
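Both ways of computing ISO described above can be sketched in a few lines of Python; the sample line is hypothetical.

```python
def iso_from_slash(slg, avg):
    """ISO as slugging percentage minus batting average."""
    return slg - avg

def iso_from_counts(doubles, triples, hr, ab):
    """ISO as extra bases per at-bat."""
    return (doubles + 2 * triples + 3 * hr) / ab

# Hypothetical: 35 doubles, 2 triples, 30 HR in 550 AB
print(round(iso_from_counts(35, 2, 30, 550), 3))  # 0.235
```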


HR/FB

Home Run to Fly Ball rate (HR/FB) is the ratio of home runs a player hits out of their total number of fly balls. While a player's raw total of home runs will tell you something, their HR/FB ratio can be useful in providing context about how sustainable their power is. For example, say there is a player who typically hits around 30 home runs a season, but last year they only hit 20. As a fan, you want to know why that drop in production happened and whether there's something to worry about. Was the player still hitting the same amount of fly balls but with a lower HR/FB rate? That could imply the player lost a touch off their power, which could be the result of an injury or the tell-tale sign of an aging slugger. Or did the player still have the same HR/FB rate, but hit fewer fly balls? If a player goes from hitting fly balls to ground balls, that could be attributed to contact issues. But if the player started to hit more line drives, they may have sacrificed some home runs for more doubles and overall hits. Context: Please note that the following chart is meant as an estimate, and that league-average HR/FB rate varies on a year-by-year basis. To see the league-average HR/FB rate for every year from 2002 to the present, check the FanGraphs leaderboards.
Rating          HR/FB
Excellent       20.0%
Great           15.0%
Above Average   12.5%
Average         9.5%
Below Average   7.5%
Poor            5.0%
Awful           1.0%

Good home run hitters typically have HR/FB ratios anywhere from 15-20%, while weaker players have ratios that range as low as 1%. Things to Remember: Different parks can result in different HR/FB numbers for hitters. For example, a right-handed hitter will have a higher HR/FB rate in Fenway Park than in PETCO Park. Use this statistic in conjunction with other batted ball statistics; it's useful, but mostly when put in the context of a player's overall hitting profile. Though more rare, home runs can also come off of line drives. This statistic is more important for evaluating pitchers than for hitters.

Park Adjustments
These numbers are difficult to calculate, and I would refer you to a copy of Total Baseball if you wish to recreate the park factor values. A value above 100 indicates a park that is good for hitters, and a value below 100 indicates a park that is good for pitchers. The ERA+ and PRO+ values are adjusted both to the league average and to the park the pitcher or batter played in. The career totals are gathered by finding what a league-average player would have done given the same playing time as the player in question and then summing these values over the player's career. Given how I store the seasonal data (as an ER total, not the league ERA), this is very easy to calculate. Similarly for PRO+ (league times on base and league total bases). Note that the lg_ERA and lg_OPS values are for a league-average player in that ballpark for single-season data, and for a league-average player with the same career path as the given player for career data. This means that two players from the same league will have different values here if they played in different parks.

Calculation of Park Factors


I largely follow the method spelled out below. Historically, B-R has used single-year park factors for recent years and 3-year park factors historically. I have changed that to now use 3-year factors by default for all years. Of course, the current season is only really a 2-year factor: the current year and last year. This can lead to some big changes in the numbers from what had been on the site. There are two other major differences. 1) Interleague games are not used in the calculation. They really mess things up because in some games the teams have the DH and in others they don't. These series are also typically not home-and-home series. 2) For the years 1957 and on, I use runs per 27 outs rather than runs per game (the IPC is always 1.00 in these cases). This is more accurate than using the IPC. Overall, these two changes make only small changes to the numbers, but I believe them to be more accurate this way. This information is taken from an archived copy of the now-defunct TotalBaseball.com website. THIS DOES NOT BELONG TO ME AND MAY BE REMOVED IF I AM ASKED TO DO SO BY A REPRESENTATIVE OF TOTALBASEBALL.COM. PARK FACTOR: Calculated separately for batters and pitchers. Above 100 signifies a park favorable to hitters; below 100 signifies a park favorable to pitchers. The computation of PF is admittedly daunting, and what follows is probably of interest to the merest handful of readers, but we feel obliged to state the mathematical underpinnings for those few who may care. We use a three-year average Park Factor for players and teams unless they change home parks. Then a two-year average is used, unless the park existed for only one year. Then a one-year

mark is used. If a team started up in Year 1, played two years in the first park, one in the next, and three in the park after that and then stopped play, the average would be as follows (where Fn is the one-year park factor for year n):
Year 1 and 2 = (F1 + F2)/2
Year 3 = F3
Year 4 = (F4 + F5)/2
Year 5 = (F4 + F5 + F6)/3
Year 6 = (F5 + F6)/2

Step 1. Find games, losses, and runs scored and allowed for each team at home and on the road. Take runs per game scored and allowed at home over runs per game scored and allowed on the road. This is the initial figure, but we must make two corrections to it. Step 2. The first correction is for innings pitched at home and on the road. This is a bit complicated, so the mathematically faint of heart may want to head back at this point. First, find the team's home winning percentage (wins at home over games at home). Do the same for road games. Calculate the Innings Pitched Corrector (IPC) shown below. If it is greater than 1, this means the innings pitched on the road are higher because the other team is batting more often in the last of the ninth. This rating is divided by the Innings Pitched Corrector, like so:
IPC = (18.5 - Wins at home / Games at home) / (18.5 - Losses on road / Games on road)

Note: 18.5 is the average number of half-innings per game if the home team always bats in the ninth. Step 3. Make corrections for the fact that the other road parks' total difference from the league average is offset by the park rating of the club that is being rated. Multiply the rating by this Other Parks Corrector (OPC):
OPC = No. of teams / (No. of teams - 1 + Run Factor, team)

(Note that this OPC differs from that presented earlier in The Hidden Game of Baseball, for in preparing the pre-1900 data for Total Baseball, we discovered that for some parks with extreme characteristics, like Chicago's Lake Front Park of 1884, which had a Home Run Factor of nearly 5, the earlier formula produced wrong results. For parks with factors of 1.5 or less, either formula works well.) Example. In 1982, Atlanta scored 388 runs and allowed 387 runs at home in 81 games, and scored 351 and allowed 315 on the road in 81 games. The initial factor is (775/81) / (666/81) = 1.164. The Braves' home record was 42-39, or .519, and their road record was 47-34, or .580. Thus the IPC = (18.5 - .519) / (18.5 - .420) = .995. The team rating is now 1.164/.995 = 1.170. The OPC = 12 / (12 - 1 + 1.170) = .986. The final runs-allowed rating is 1.170 X .986, or 1.154. We warned you it wouldn't be easy! The batter adjustment factor is composed of two parts, one the park factor and the other the fact that a batter does not have to face his own team's pitchers. The initial correction takes care of only the second factor. Start with the following (SF = Scoring Factor, previously determined [for Atlanta, 1.154], and SF1 = Scoring Factor of the other clubs [NT = number of teams]):
SF1 = 1 - (SF - 1) / (NT - 1)

Next is an iterative process in which the initial team pitching rating is assumed to be 1, and the following factors are employed:

RHT, RAT = runs per game scored at home (H) and away (A) by the team
OHT, OAT = runs per game allowed at home and away by the team
RAL = runs per game for all games in the league

Now, with the Team Pitching Rating (TPR) = 1, we proceed to calculate the Team Batting Rating (TBR):
TBR = ((RAT/SF1 + RHT/SF) * (1 + (TPR - 1)/(NT - 1))) / RAL

TPR = ((OAT/SF1 + OHT/SF) * (1 + (TBR - 1)/(NT - 1))) / RAL

The last two steps are repeated three more times. The final Batting Corrector, or Batters' Park Factor (BPF) is
BPF = (SF + SF1) / (2 * (1 + (TPR - 1)/(NT - 1)))

Similarly, the final Pitching Corrector, or Pitchers' Park Factor (PPF) is

PPF = (SF + SF1) / (2 * (1 + (TBR - 1)/(NT - 1)))

Now an example, using the 1982 Atlanta Braves once again.


RHT = 388/81 = 4.79
RAT = 351/81 = 4.33
OHT = 387/81 = 4.78
OAT = 315/81 = 3.89
RAL = 7947/972 = 8.18

NT = 12
SF = 1.154
SF1 = 1 - (1.154 - 1)/11 = .986

TBR = ((4.33/.986 + 4.79/1.154) * (1 + (1 - 1)/11)) / 8.18 = 1.044

TPR = ((3.89/.986 + 4.78/1.154) * (1 + (1.044 - 1)/11)) / 8.18 = .993

Repeating these steps gives a TBR of 1.04 and a TPR of .97. The Batters' Park Factor is
BPF = (1.154 + .986) / (2 * (1 + (.97 - 1)/11)) = 1.07

This is not a great deal removed from taking the original ratio, (1.170 + 1)/2, which is 1.08.

The Pitchers' Park Factor may be calculated in analogous fashion. To apply the Batters' Park Factor to Batting Runs, one must use this formula:
BR(corrected) = BR(uncorrected)/BPF - (Runs(league) / (AB+BB+HBP)(league)) * (BPF - 1) * (AB+BB+HBP)(player or team)

For example, if a player produces 20 runs above average in 700 plate appearances with a Batters' Park Factor of 1.10, and the league average of runs produced per plate appearance is .11, this means that the player's uncorrected Batting Runs is 20 over the zero point of 700 X .11 (77 runs). In other words, 77 runs is the average run contribution expected of this batter were he playing in an average home park. But because his Batters' Park Factor is 1.10, which means his home park was 10 percent kinder to hitters than the average, you would really expect an average run production of 1.1 X 77, or 85 runs. Thus the player whose uncorrected run total is 97 with a BPF of 1.1 is only +10 runs rather than +20, and 10 is his Park Adjusted Batting Runs (in the Player Register, BR/A): 10 = 20/1.10 - .11 X (1.10 - 1) X 700.
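Because the iterative part of this method is easy to lose track of in prose, here is a Python sketch that follows the steps above for the 1982 Atlanta example. It is a sketch of the published procedure, not Baseball-Reference's actual code, and it takes the corrected scoring factor of 1.154 and the league totals from the example as given.

```python
# Sketch of the park factor iteration for the 1982 Braves example above.
NT = 12                          # teams in the league
SF = 1.154                       # Atlanta's scoring factor after IPC and OPC
SF1 = 1 - (SF - 1) / (NT - 1)    # scoring factor of the other clubs (~0.986)

RHT, RAT = 388 / 81, 351 / 81    # runs per game scored at home, away
OHT, OAT = 387 / 81, 315 / 81    # runs per game allowed at home, away
RAL = 7947 / 972                 # league runs per game (~8.18)

# Start with a team pitching rating of 1, then repeat the two steps.
TPR = 1.0
for _ in range(4):
    TBR = (RAT / SF1 + RHT / SF) * (1 + (TPR - 1) / (NT - 1)) / RAL
    TPR = (OAT / SF1 + OHT / SF) * (1 + (TBR - 1) / (NT - 1)) / RAL

BPF = (SF + SF1) / (2 * (1 + (TPR - 1) / (NT - 1)))   # Batters' Park Factor
PPF = (SF + SF1) / (2 * (1 + (TBR - 1) / (NT - 1)))   # Pitchers' Park Factor
print(round(TBR, 2), round(BPF, 2), round(PPF, 2))    # 1.04 1.07 1.07
```

The printed TBR and BPF line up with the worked example (1.04 and 1.07).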

Spd
Speed score (Spd) is a statistic developed by Bill James that rates a player on their speed and baserunning ability. Different sources include slightly different components, but the FanGraphs version consists of Stolen Base Percentage, Frequency of Stolen Base Attempts, Percentage of Triples, and Runs Scored Percentage. Speed score is a bit of an outdated stat at this point, as it doesn't account for all aspects of baserunning and its results aren't presented on a runs-above-average scale. Instead, if you're looking for a measure of how much value a player adds to their team through their baserunning, check out the statistic Ultimate Base Running (UBR). Context: Please note that the following chart is meant as an estimate, and that league-average Speed Score varies on a year-by-year basis. To see the league-average Spd for every year from 1901 to the present, check the FanGraphs leaderboards.

Rating          Spd
Excellent       7.0
Great           6.0
Above Average   5.5
Average         4.5
Below Average   4.0
Poor            3.0
Awful           2.0


GB%, LD%, FB%


Batted ball statistics are fairly straightforward: they express how many of a batter's balls in play are line drives, ground balls, or fly balls. This includes balls that leave the park (home runs), so the sum of a batter's batted ball statistics should be 100%. Major league ballplayers have a variety of swings, resulting in a large number of different batted ball profiles. Some batters hit lots of fly balls (typically power hitters), others put lots of balls on the ground (contact hitters), and many others fall somewhere in between. Infield pop-ups are also tracked on FanGraphs (IFFB%), and they are expressed as the percentage of pop-ups a batter hits out of their total number of fly balls. These numbers are generally small and fluctuate from year to year. They're the worst batted ball type for batters, as they are easy outs. Context: Please note that the following chart is meant as an estimate, and that league-average batted ball rates vary slightly on a year-by-year basis. To see the league-average batted ball breakdown for every year from 2002 to the present, check the FanGraphs leaderboards.
Batted Ball Type   League Average
LD                 20%
GB                 44%
FB                 36%
IFFB               10%

Power hitters will generally have higher fly ball rates (~44%), while contact hitters normally have high ground ball rates (50+%). And all hitters will hit their share of pop-ups. Things to Remember:

A line drive produces 1.26 runs per out, while fly balls produce 0.13 runs per out and groundballs produce 0.05 runs per out. In other words, batters want to hit lots of line drives and fly balls, while pitchers generally want to induce groundballs. Players that don't hit many balls in the air (higher GB% with lower FB% and LD%) generally have higher BABIPs and batting averages, but limited power. This data is tracked by Baseball Info Solutions (BIS), which is why it's only available for players back to 2002.

K% and BB%
Strikeout rate (K%) and walk rate (BB%) measure how often a position player walks or strikes out per plate appearance.* They're measured in percentage form, so it's easy to compare between players and years. High walk rates are good for batters because it means they're reaching base more often, while low walk rates are bad. Strikeout rates are a bit tougher to pin down: while making outs is bad, striking out isn't necessarily worse than any other sort of out. If a player is still getting hits, walking, and reaching base at a high rate, then they can still be a valuable offensive piece despite a high strikeout rate.

*In the past, FanGraphs carried K% as K/AB. In 2011, K% was changed to K/PA.

Context: Please note that this chart is meant as an estimate, and that league-average strikeout and walk rates vary on a year-by-year basis. To see the league-average strikeout and walk rate for every year from 1901 to the present, check the FanGraphs leaderboards.

Rating          K%       BB%
Excellent       10.0%    15.0%
Great           12.5%    12.5%
Above Average   15.0%    10.0%
Average         18.5%    8.5%
Below Average   20.0%    7.0%
Poor            25.0%    5.5%
Awful           27.5%    4.0%

Things to Remember: Power hitters tend to have high strikeout and walk rates, since they may swing and miss often, yet are pitched around by pitchers. Contact hitters are the opposite; they tend to have low strikeout and walk rates. The more a player strikes out, the tougher it is for them to maintain a high batting average, since they are putting fewer balls in play.

Plate Discipline (O-Swing%, Z-Swing%, etc.)


Plate discipline statistics tell us how patient a player is at the plate, and how good they are at making contact with pitches. There are a number of these stats:

O-Swing%: The percentage of pitches a batter swings at outside the strike zone.
Z-Swing%: The percentage of pitches a batter swings at inside the strike zone.
Swing%: The overall percentage of pitches a batter swings at.
O-Contact%: The percentage of pitches outside the strike zone a batter makes contact with when swinging.
Z-Contact%: The percentage of pitches inside the strike zone a batter makes contact with when swinging.
Contact%: The overall percentage of pitches a batter makes contact with when swinging.
Zone%: The overall percentage of pitches a batter sees inside the strike zone.
F-Strike%: The percentage of first-pitch strikes.
SwStr%: The percentage of total pitches a batter swings and misses on.

FanGraphs carries plate discipline statistics based on both Baseball Info Solutions (BIS) data and PITCHf/x data. There are some minor differences between the two systems: the PITCHf/x section presents raw PITCHf/x data broken down according to defined baselines, while the BIS section takes the PITCHf/x data and uses human scorers to modify classifications. Here are the batter plate discipline leaderboards according to BIS data, and here are the same leaderboards according to PITCHf/x data.

Context: Please note that the following chart is meant as an estimate, and that the league average for all of these stats varies on a year-by-year basis. To see the league-average plate discipline stats for every year from 2002 to the present, check the FanGraphs leaderboards.

Stat        Average
O-Swing     30%
Z-Swing     65%
Swing       46%
O-Contact   68%
Z-Contact   88%
Contact     81%
Zone        45%
F-Strike    59%
SwStr       8.5%

Things to Remember: These statistics are useful for evaluating hitters and pitchers, with SwStr% being especially important when looking at pitchers.

Swinging strike percentage (SwStr%, swinging strikes per pitch) should not be confused with whiff rate (swinging strikes per swing).

Pitch Type Linear Weights


The Pitch Type Linear Weights (Pitch Values) section on FanGraphs attempts to answer the question, "Which pitch is a batter most successful against?" The changes in run expectancy between an 0-0 count and a 0-1 or 1-0 count are obviously very small, but when added up over the course of a season, you can get an idea of which pitch a hitter is best against and usually drives. If they hit one pitch especially hard, or they are less likely to chase sliders, these successes will show up in Pitch Type Linear Weights. Also, if a hitter swings and misses at a specific pitch frequently, this problem will show up. You'll notice that there are two different types of Pitch Type Linear Weights: total runs by pitch (shown as wFB, wSL, wCB, etc.) and standardized runs by pitch (shown as wFB/C, wSL/C, wCB/C, etc.). The first category is the total runs above average that a hitter has contributed against that pitch. However, it is tough to compare these totals, since hitters see different amounts of each pitch. The second category corrects for this, standardizing the values on a per-100-pitch basis. In other words, when you see wFB/C, that represents the average number of runs that hitter produced per 100 fastballs thrown. Context: A score of zero is average, with negative scores being below average and positive scores being above average. Total pitch values will generally fall somewhere between +20 and -20 runs, with the most extreme touching +/-30. On a per-100-pitch basis, the range shrinks to around +1.5 to -1.5 runs. Again, you'll see some extreme scores on either end of the spectrum, but that's the range most pitches and hitters fall into. Things to Remember: This stat has limited predictive power. It can show you what pitches a hitter has had success against in the past, but you should be careful in extrapolating those results and projecting the future; it's a descriptive statistic, not a predictive one. Beware of sample sizes! Pitches can get misclassified sometimes, and you should be careful about drawing conclusions from pitches that a hitter hasn't faced in large numbers. Pitch Type Linear Weights can also be used to evaluate pitchers, showing which pitches they have had the most success with in the past. In this case, the values are flipped; positive values are a good result for the pitcher, not the hitter.

Pitch Type Linear Weights Explained


by Dave Allen - May 21, 2009

Yesterday David Appleman announced a new section at FanGraphs showing the linear weights run value for each pitcher and pitch type. He asked me to write a short explanation of how these values are calculated. The run value of any event is the change in the expected number of runs scored over the rest of the inning from before to after the event happened. The expected number of runs scored is the average number scored from a given out and base-occupancy state. Let's take Tuesday's Oakland at Tampa Bay game as an example. At the top of the 1st inning, with none on and none out, the average team scores 0.55 runs. Orlando Cabrera hit a single off of Jamie Shields. Now, with a runner on first and none out, the average team scores 0.95 runs. So the run value of the single was 0.95 - 0.55 = 0.4. You can do the same thing taking each pitch as an event, rather than the outcome of each at-bat. To do this you need to know the run expectancy from each count; in other words, the average run value of all events from at-bats that pass through a given count. For example, the run expectancy of a 3-0 count is 0.2; on average, at-bats from that count are worth about half of a single. Now we can run through Cabrera's at-bat with Shields as an example of valuing each pitch in an at-bat.

0-0: Run Value 0.00
Pitch 1: Fastball for a ball
1-0: Run Value 0.03
So the value of that first pitch was 0.03 runs. On average the A's will score 0.03 more runs than before Shields threw the pitch.

1-0: Run Value 0.03
Pitch 2: Fastball for a called strike
1-1: Run Value -0.02
The run value of that fastball was -0.02 - 0.03 = -0.05.

1-1: Run Value -0.02
Pitch 3: Fastball for a called strike
1-2: Run Value -0.08
The run value of that fastball was -0.06. You can see here that the run value of a strike (or any other event) is count-dependent.

1-2: Run Value -0.08
Pitch 4: Fastball for a ball
2-2: Run Value -0.04
The run value of that fastball was 0.04.

2-2: Run Value -0.04
Pitch 5: Changeup fouled off
2-2: Run Value -0.04
Since the count did not change, the run expectancy did not change either, so the run value of this changeup was 0.

2-2: Run Value -0.04
Pitch 6: Fastball hit for a single
Runner on first, no outs: Run Value 0.4
The run value of this fastball is the change in run expectancy, 0.4 - (-0.04) = 0.44.

Shields threw five fastballs valued at 0.03, -0.05, -0.06, 0.04 and 0.44. He threw one changeup that had a value of 0.00. These values are the change in run expectancy in the game, so a negative number is good for the pitcher (fewer runs scored). On the player pages the numbers are flipped, so a positive number indicates a good pitch: the number of runs saved by those pitches.
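Here is a small Python sketch of the bookkeeping in that walkthrough, using only the run-expectancy-by-count values quoted above (the dictionary is limited to the counts that appear in this one at-bat, so it is illustrative rather than a full run-expectancy table).

```python
# Run expectancy by count, taken from the example above; the terminal single
# is valued at +0.40 runs from the base/out state change.
count_re = {
    (0, 0): 0.00, (1, 0): 0.03, (1, 1): -0.02,
    (1, 2): -0.08, (2, 2): -0.04,
}

# The at-bat as (pitch description, run expectancy after the pitch) pairs.
pitches = [
    ("Fastball, ball",          count_re[(1, 0)]),
    ("Fastball, called strike", count_re[(1, 1)]),
    ("Fastball, called strike", count_re[(1, 2)]),
    ("Fastball, ball",          count_re[(2, 2)]),
    ("Changeup, foul",          count_re[(2, 2)]),
    ("Fastball, single",        0.40),
]

state = count_re[(0, 0)]
for pitch, after in pitches:
    # Each pitch's run value is the change in run expectancy it caused.
    print(f"{pitch}: {after - state:+.2f}")
    state = after
```

Running this prints +0.03, -0.05, -0.06, +0.04, +0.00, and +0.44, matching the values in the article.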


DRS
Defensive Runs Saved (DRS) is a defensive statistic calculated by The Fielding Bible, an organization run by John Dewan, that rates individual players as above or below average on defense. Much like UZR, players are measured in runs above or below average, and Baseball Info Solutions data is used as an input. The full explanation of how DRS is calculated is a tad complicated (see this FAQ page for more detailed information), but in simple terms: "as I understand it, the numbers determine (using film study and computer comparisons) how many more or fewer successful plays a defensive player will make than league average. For instance, if a shortstop makes a play that only 24% of shortstops make, he will get .76 of a point (1 full point minus .24). If a shortstop BLOWS a play that 82% of shortstops make, then you subtract .82 of a point. And at the end, you add it all up and get a plus/minus." (Joe Posnanski, Sports Illustrated)

FanGraphs reports a large number of fielding calculations using this system, all of them measured in runs above average. Descriptions come from the Fielding Bible website:

rSB - Stolen Base Runs Saved (Catchers/Pitchers) measures two things: the pitcher's contribution to controlling the running game, and the catcher's credit for throwing out runners and preventing them from attempting steals in the first place.
rBU - Bunt Runs Saved (1B/3B) evaluates a fielder's handling of bunted balls in play.
rGDP - Double Play Runs Saved (2B/SS) credits infielders for turning double plays as opposed to getting only one out on the play.
rARM - Outfield Arms Runs Saved evaluates an outfielder's throwing arm based on how often runners advance on base hits and are thrown out trying to take extra bases.
rHR - HR Saving Catch Runs Saved credits the outfielder 1.6 runs per robbed home run.
rPM - Plus Minus Runs Saved evaluates the fielder's range and ability to convert a batted ball to an out.
DRS - Total Defensive Runs Saved indicates how many runs a player saved or hurt his team in the field compared to the average player at his position.

To reiterate, Defensive Runs Saved (DRS) captures a player's total defensive value. Context: Before drawing any conclusions about a player's defense, look at a full three years of defensive data, drop the decimal points and take an average, and compare DRS scores with other defensive metrics (UZR, TZL, etc.). By taking a broader picture, you will help ensure that you're not being over-confident or overstating a player's defensive abilities. DRS scores can be broken down into the same general tiers as UZR:
Defensive Ability       DRS
Gold Glove Caliber      +15
Great                   +10
Above Average           +5
Average                 0
Below Average           -5
Poor                    -10
Awful                   -15

Things to Remember:

Looking for even more information on how DRS is calculated? Head over to the Fielding Bible, where you can find an extensive article that explains their process in detail. DRS uses Baseball Info Solutions (BIS) data in calculating its results. It's important to note that this data is compiled by human scorers, which means that it likely includes some human error. Until FIELDF/x data gets released to the public, we are never going to have wholly accurate defensive data; human error is impossible to avoid when recording fielding locations by hand, no matter how meticulous the scorers. That said, BIS data is still the best, most accurate defensive data available at this time, so just be careful not to overstate claims about a player's defensive prowess based solely on defensive stats. DRS is comparable to UZR in terms of methodology (e.g. the use of zones for evaluating defensive success rates) and results. There are some slight differences between the two systems (see below), so DRS and UZR will occasionally disagree on how to rate certain players, but they agree more often than they disagree. The differences between the two systems are smaller than they seem at first glance: both systems have the same goal, estimating a player's defensive worth in units of runs, and both rely on hit location and type data from Baseball Info Solutions. The differences lie in the various adjustments and calculations that are made. For example, Defensive Runs Saved uses a rolling one-year basis for the Plus/Minus system, while Lichtman uses several years of data to determine each play's difficulty level. Defensive Runs Saved also includes components to measure pitcher and catcher defense. (The Fielding Bible)
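To make the plus/minus arithmetic in the Posnanski quote above concrete, here is a toy Python sketch. The play probabilities are invented, and the real system goes on to convert the plus/minus tally into runs with additional adjustments, so this is only the bookkeeping step.

```python
# Toy plus/minus tally: +(1 - p) for a play made that only a fraction p of
# fielders make, and -p for a play missed that a fraction p of fielders make.
plays = [
    ("made", 0.24),    # converted a play only 24% of shortstops make: +0.76
    ("missed", 0.82),  # blew a play 82% of shortstops make: -0.82
    ("made", 0.95),    # routine play: small credit of +0.05
]

plus_minus = sum((1 - p) if result == "made" else -p for result, p in plays)
print(round(plus_minus, 2))  # -0.01
```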

RZR
Revised Zone Rating (RZR) measures "the proportion of balls hit into a fielder's zone that he successfully converted into an out" (Hardball Times). Invented by John Dewan and displayed on the Hardball Times for a number of years, it has fallen out of fashion as more advanced, accurate measures have become available (like UZR and DRS), but it still works as an introductory defensive statistic to show saber-newbies how advanced defensive statistics work. RZR measures a player's range, taking three things into account: the number of Balls In Zone (BIZ) a player receives, a player's total Plays Made, and a player's total number of Out Of Zone Plays Made (OOZ). It is, essentially, a simplified version of UZR. For more information on how the zones are classified, check out the detailed explanation over at The Hardball Times. Context: Please note that the following chart is meant as an estimate, and that league-average RZR varies on a year-by-year basis. To see the league-average RZR for every year from 2002 to the present, check the FanGraphs leaderboards.

Rating          RZR
Excellent       0.940
Great           0.900
Above Average   0.860
Average         0.835
Below Average   0.800
Poor            0.750
Awful           0.700

Catcher Defense
Evaluating catcher defense has long been one of the banes of saberists everywhere. While there have been some strides in evaluating defense for position players (see: UZR, DRS), catchers are a separate world in and of themselves. There are a number of different defensive skills that catchers need to possess, and each of them has the potential to impact their overall defensive value: arm strength and accuracy, pitch blocking ability, pitch framing ability, and pitch selection. At the moment, saberists have focused their research on the first three factors, as pitch selection is a tricky thing to isolate and does not reside solely in a catcher's purview. You can currently find the first two factors on FanGraphs: arm strength and accuracy, and pitch blocking ability. rSB: Calculated by The Fielding Bible, Stolen Base Runs Saved measures how many runs a catcher contributes to their team by throwing out runners and preventing runners from attempting steals in the first place. RPP: First calculated by Bojan Koprivica, Passed Pitch Runs (RPP) calculates the number of runs above or below average a catcher is at blocking pitches. Both of these values are then added together, and they account for the fielding component in WAR for catchers. Context: Like all defensive stats, both rSB and RPP are centered around 0, meaning that a score of zero is considered league average. Scores above zero are good, and those below zero are bad.

Rating          rSB, RPP
Excellent       +5
Great           +3
Above Average   +1
Average         0
Below Average   -1
Poor            -3
Awful           -5

Note: these values are for each stat separately. If you would like tiers for catcher defensive value in total, double the values of the above tiers in order to get a quick-and-dirty estimate.

Things to Remember: While FanGraphs does not yet account for pitch framing, Mike Fast has made some remarkable strides in measuring a catcher's pitch framing ability. If you're interested, be sure to check out his research. Next to RPP, you will also see the stat Calculated Passed Pitches (CPP). This is another statistic derived from Bojan's research, and it measures how many passed balls a catcher should have allowed based on the pitches he has seen.


TZ / TZL
While UZR and DRS are both based on data collected by Baseball Info Solutions (BIS), Total Zone (TZ) is the lone defensive statistic calculated exclusively using play-by-play data available from Retrosheet. Invented by Sean Smith, it's calculated in a variety of ways depending upon how much data is available in a specific year (details can be found here), but since it only requires play-by-play data, TZ scores can be calculated for any player in baseball history. As a result, TZ is used in historical WAR scores on Baseball-Reference.com and FanGraphs (pre-2002). Total Zone with Location Data (TZL) is an improved version of TZ that Sean Smith developed in 2010. You can read about all its details here, but in short, it uses Gameday hit location data to make its calculations more accurate. Context: Like UZR and DRS, TotalZone and TotalZone with Location Data are both presented as runs saved. League average is zero, while positive scores represent above-average fielding and negative scores denote below-average fielding.
Defensive Ability       TZ/TZL
Gold Glove Caliber      +15
Great                   +10
Above Average           +5
Average                 0
Below Average           -5
Poor                    -10
Awful                   -15

Things to Remember: TZ and TZL are both good metrics, although UZR and DRS are still considered the more accurate fielding metrics. However, UZR and DRS can only be calculated for modern-day players due to technology constraints, so TZ is the best historical fielding statistic available. If you want to compare a modern-day player's fielding with a player from the 1940s, TZ would be the statistic to use.


FSR
The Fan Scouting Report is a yearly project conducted by Tom Tango that rates players on their defensive ability based on fan observations and voting. Fans are asked to rate players on a 0-100 scale (with 100 being the best and 0 being the worst) in a number of different categories: Instinct, Speed, Hands, Arm Strength and Accuracy, First Step, etc. These raw ratings are presented on the FanGraphs leaderboards and on player pages. The ratings are also compiled and converted into one overall stat, FSR, which measures a player's total defensive ability in runs above or below average. This statistic is presented on the same scale as UZR and DRS, meaning it can be compared directly with those two statistics to provide more context on a player's defense. Obviously, the Fan Scouting Report is based on subjective ratings, but the idea is that fans watch their team on a daily basis and are quite knowledgeable about their team's players. When taken in a large enough group, fans can theoretically provide an accurate measure of a player's overall defensive ability and ranking. Context: Just like with UZR and DRS, the Fan Scouting Report rates players on a runs scale:
Defensive Ability       FSR
Gold Glove Caliber      +15
Great                   +10
Above Average           +5
Average                 0
Below Average           -5
Poor                    -10
Awful                   -15

Things to Remember: The accuracy of the Fan Scouting Report is debatable, as fans are not professional scouts, have their own inherent biases, and are limited to what they can see on television. However, as long as you realize these limitations and do not use FSR as the be-all-end-all defensive statistic (always use it in conjunction with UZR and DRS), it works fine.


UZR
Ultimate Zone Rating (UZR) is one of the most widely used, publicly available defensive statistics. The theory behind UZR is tougher to intuitively grasp than Defensive Runs Saved (DRS), but the simplified version is that UZR puts a run value on defense, attempting to quantify how many runs a player saved or gave up through their fielding prowess (or lack thereof). There are a couple of different components to UZR, including:

Outfield Arm Runs (ARM) - The number of runs above average an outfielder saves with their arm by preventing runners from advancing.
Double-Play Runs (DPR) - The number of runs above average an infielder contributes by turning double plays.
Range Runs (RngR) - Is the player an Ozzie Smith or an Adam Dunn? Does he get to more balls than average or not?
Error Runs (ErrR) - Does the player commit more or fewer errors compared with a league-average player at their position?

The run values in each of these categories are then compiled into one overall defensive score, UZR. Since UZR is measured in runs, it can be compared easily with a player's offensive contributions (wRAA).

For the details on how UZR is calculated (i.e., how we can attach a run value to defensive events), see the FanGraphs UZR Primer. Context: Since defensive statistics are still relatively new, they should not be taken as 100% accurate. Before drawing any conclusions about a player's defense, look at a full three years of defensive data, drop the decimal points and take an average, and compare UZR scores with other defensive metrics (DRS, TZL, etc.). By taking a broader picture, you will help ensure that you're not being over-confident or overstating a player's defensive abilities. In general, UZR scores can be broken down into the following tiers. This is a good shorthand way of evaluating a player's defensive ability level:
Defensive Ability       UZR
Gold Glove Caliber      +15
Great                   +10
Above Average           +5
Average                 0
Below Average           -5
Poor                    -10
Awful                   -15

Things to Remember: Beware of sample sizes! If a player only spent 50 innings at a position last season, it'd be a good idea not to draw too many conclusions from their UZR score over that time. As with any defensive statistic, you should always use three years of UZR data before trying to draw any conclusions about the true talent level of a fielder. UZR uses Baseball Info Solutions (BIS) data in calculating its results. It's important to note that this data is compiled by human scorers, which means that it likely includes some human error. Until FIELDF/x data gets released to the public, we are never going to have wholly accurate defensive data; human error is impossible to avoid when recording fielding locations by hand, no matter how meticulous the scorers. That said, BIS data is still the best, most accurate defensive data available at this time, so just be careful not to overstate claims about a player's defensive prowess based solely on defensive stats. Since UZR is a counting statistic like RBIs or HRs, the more playing time a player accrues, the higher (or lower) their UZR will be. In order to compare players with different amounts of playing time, UZR can be scaled on a 150-game basis (UZR/150). If you want to compare a player with 90 games played to someone with 140, UZR/150 would be the way to do so.

UZR is park-adjusted, meaning it adjusts for the fact that fielders have to deal with odd quirks in certain ballparks.

ERA
Earned Run Average (ERA) is a rudimentary metric designed to assess how well a pitcher has performed in the past. To calculate it, divide a pitcher's total number of earned runs allowed by his total number of innings pitched and multiply by nine.

ERA = (Earned Runs / Innings Pitched) * 9

ERA is not a good predictor of future success, as earned runs are dependent on multiple factors outside of the pitcher's control: defense, umpiring, the judgment of a scorekeeper, etc. Pitchers aren't held accountable for runs scored as a result of an error by one of their fielders, but they are held accountable for runs scored on bloop hits that get by slow or poor defenders. If a pitcher has a poor defense behind him, he will likely end up with a higher ERA than he should have. As a result, a pitcher's ERA is a poor estimate of their true talent level. Consider this scenario: in your mind, who is the better player, the pitcher that strikes out the side, or the pitcher that relies on his rangy defense to haul in three deep fly balls? ERA thinks that both pitchers are just as good, but that's not telling you the full picture. Context: Please note that the following chart is meant as an estimate, and that league-average ERA varies widely on a year-by-year basis. To see the league-average ERA for every year from 1901 to the present, check the FanGraphs leaderboards.

Rating          ERA
Excellent       2.90
Great           3.25
Above Average   3.75
Average         4.00
Below Average   4.20
Poor            4.50
Awful           5.00

Things to Remember: ERA is difficult to compare across teams due to differences in team defenses, difficult to compare across leagues due to competition imbalance and the DH, and difficult to compare across years because of different run-scoring environments. A 3.50 ERA has a different meaning depending on whether that pitcher played in the early 2000s or the Dead Ball Era, or whether they played in a pitcher's or hitter's park. To adjust for park and league effects, check out ERA-. It's still not a perfect statistic, but it does make it easier to compare pitchers from different time periods and parks.

WHIP
Walks plus Hits per Inning Pitched (WHIP) measures how many baserunners a pitcher allows. Its formula is exactly as it sounds: walks plus hits, all divided by innings pitched. It's a common statistic in fantasy baseball leagues, but it's a poor evaluative tool, since pitchers have little control over their Batting Average on Balls in Play (BABIP). If you want to get serious about evaluating a pitcher's ability to prevent baserunners, look instead at their walk rate, BABIP, and batted ball profile. Pitchers do control their walk rates, but their hit rates reside largely outside their control and are prone to fluctuation. Context: Great pitchers will typically have lower WHIPs, and questionable pitchers will have higher WHIPs. However, since hit rates vary depending upon a pitcher's BABIP, WHIP values are also prone to fluctuating. Please note that the following chart is meant as an estimate, and that league-average WHIP varies on a year-by-year basis. To see the league-average WHIP for every year from 1901 to the present, check the FanGraphs leaderboards.

Rating          WHIP
Excellent       1.00
Great           1.10
Above Average   1.25
Average         1.32
Below Average   1.40
Poor            1.50
Awful           1.60

FIP
Fielding Independent Pitching (FIP) measures what a pitcher's ERA should have looked like over a given time period, assuming that performance on balls in play and timing were league average. Back in the early 2000s, research by Voros McCracken revealed that the rate at which a pitcher's balls in play fall for hits does not correlate well from season to season. In other words, pitchers have little control over balls in play. McCracken outlined a better way to assess a pitcher's talent level by looking at results a pitcher can control: strikeouts, walks, hit by pitches, and home runs. A walk is not as harmful as a home run, and a strikeout has less impact than both. FIP accounts for these kinds of differences, presenting the results on the same scale as ERA. It has been shown to be more effective than ERA in terms of predicting future performance and has become a mainstay in sabermetric analysis. For those curious, here's the formula for FIP:

FIP = ((13*HR) + (3*(BB+HBP)) - (2*K)) / IP + constant

The constant is solely to bring FIP onto an ERA scale and is generally around 3.20. You can find historical FIP constant values here, or you can derive the constant by subtracting league-average FIP (before the constant is added) from league-average ERA.
Context: Please note that the following chart is meant as an estimate, and that league-average FIP varies on a year-by-year basis so that it is always the same as league-average ERA. To see the league-average FIP for every year from 1901 to the present, check the FanGraphs leaderboards.

Rating          FIP
Excellent       2.90
Great           3.25
Above Average   3.75
Average         4.00
Below Average   4.20
Poor            4.50
Awful           5.00

Things to Remember: Voros McCracken's research was called Defense Independent Pitching Theory (DIPS Theory). It's the building block of much of today's pitching analysis. It can be a tricky concept to understand and counter-intuitive for many baseball fans. Refer to our sections on DIPS, BABIP, and Luck for more information. FIP does a better job of predicting the future than measuring the present, as there can be a lot of fluctuation in small samples. It is less effective at describing a pitcher's single-game performance and is more appropriate over a season's worth of innings.
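To make the formula and the constant concrete, here is a small Python sketch. The functions and sample totals are illustrative, not FanGraphs' own code; the constant derivation simply follows the league-average-ERA-minus-league-average-FIP idea described above.

def fip(hr, bb, hbp, k, ip, constant=3.20):
    # FIP = (13*HR + 3*(BB+HBP) - 2*K) / IP + constant
    return (13*hr + 3*(bb + hbp) - 2*k) / ip + constant

def fip_constant(lg_hr, lg_bb, lg_hbp, lg_k, lg_ip, lg_era):
    # The constant is league ERA minus unadjusted league FIP, which forces
    # league-average FIP to equal league-average ERA.
    return lg_era - (13*lg_hr + 3*(lg_bb + lg_hbp) - 2*lg_k) / lg_ip

# Example: 20 HR, 50 BB, 5 HBP, 180 K over 190 IP with a 3.20 constant
print(round(fip(20, 50, 5, 180, 190), 2))  # about 3.54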

xFIP
Expected Fielding Independent Pitching (xFIP) is a regressed version of FIP, developed by Dave Studeman from The Hardball Times. Its calculated in the same way as FIP, except it replaces a pitchers home run total with an estimate of how many home runs they should have allowed. This estimate is calculated by taking the league-average home run to fly ball rate (~9-10% depending on the year) and multiplying it by a pitchers fly ball rate. Home run rates are generally unstable over time and fluctuate around league-average, so by estimating a pitchers home run total, xFIP attempts to isolate a players ability level. A pitcher may allow home runs on 12% of their flyballs one year, then turn around and only allow 7% the next year. HR/FB ratios can be very difficult to predict, so xFIP attempts to correct for that. Here is the full formula for xFIP. Notice how it is almost exactly the same as the formula for FIP, with the lone difference being how each accounts for home runs: xFIP = ((13*(Flyballs * League-average HR/FB rate))+(3*(BB+HBP))-(2*K))/IP + constant

The constant is solely to bring FIP onto an ERA scale and is generally around 3.20. You can find historical FIP constant values here, or you can derive the constant by taking leagueaverage FIP and subtracting that from league-average ERA. League-average home run per fly ball rate varies on a yearly basis, but you can find those values here on the FanGraphs leaderboards. Along with FIP, xFIP is one of the best metrics at predicting a pitchers future performance. Since it was created, though, there have been some studies that suggest certain pitchers can post lower-than-average HR/FB rates over time. For more information on this, see the statistic SIERA. Context: Please note that the following chart is meant as an estimate, and that league-average xFIP varies on a year-by-year basis so that it is always the same as league-average ERA. To see the league-average xFIP for every year from 1901 to the present, check the FanGraphs leaderboards. Rating Excellent Great Above Average Average Below Average Poor Awful Things to Remember: While HR/FB ratios are generally unstable over time, some pitchers are still more prone to allowing home runs than others. If a pitcher has a long history of over- or under-performing the league average with their HR/FB rate, then you can reasonably expect them to perform closer to their career average than the league-average. In cases like this, xFIP may overestimate or underestimate a players true talent level by assuming a league average HR/FB ratio. Again, for more, see SIERA. Ground ball pitchers typically have higher HR/FB ratios than fly ball pitchers. xFIP has one of the highest correlations with future ERA of all the pitching metrics. Only SIERA out-paces it. xFIP 2.90 3.25 3.75 4.00 4.20 4.50 5.00
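And here is the same sketch adapted for xFIP, swapping actual home runs for fly balls times an assumed league HR/FB rate. The function name and the ~9.5% rate are illustrative only.

def xfip(fly_balls, lg_hr_fb, bb, hbp, k, ip, constant=3.20):
    # xFIP replaces actual HR with expected HR: fly balls * league HR/FB rate
    expected_hr = fly_balls * lg_hr_fb
    return (13*expected_hr + 3*(bb + hbp) - 2*k) / ip + constant

# Example: 210 fly balls at a ~9.5% league HR/FB rate is roughly 20 expected HR,
# so the result lands near the FIP example above.
print(round(xfip(210, 0.095, 50, 5, 180, 190), 2))  # about 3.54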

SIERA
Skill-Interactive ERA (SIERA) is the newest in a long line of ERA estimators. Like its predecessors FIP and xFIP, SIERA attempts to answer the question: what is the underlying skill level of this pitcher? How well did they actually pitch over the past year? Should their ERA have been higher, lower, or was it about right?

But while FIP and xFIP largely ignore balls in play they focus on strikeouts, walks, and homeruns instead SIERA adds in complexity in an attempt to more accurately model what makes a pitcher successful. SIERA doesnt ignore balls in play, but attempts to explain why certain pitchers are more successful at limiting hits and preventing runs. This is the strength of SIERA; while it is only slightly more predictive than xFIP, SIERA tells us more about the how and why of pitching. Heres what SIERA tells us: Strikeouts are goodeven better than FIP suggests. High strikeout pitchers generate weaker contact, which means they allow fewer hits (AKA have lower BABIPs) and have lower homerun rates. The same can be said of relievers, as they enter the game for a short period of time and pitch with more intensity. Also, high strikeout pitchers can increase their groundball rate in double play situations. Situational pitching is a skill for pitchers with dominant stuff. Walks are badbut not that bad if you dont allow many of them. Walks dont hurt lowwalk pitcher nearly as much as they hurt other pitchers, since low-walk pitchers can limit further baserunners. Similarly, if a pitcher allows a large amount of baserunners, they are more likely to allow a high percentage of those baserunners to score. Balls in play are complicated. In general, groundballs go for hits more often than flyballs (although they dont result in extra base hits as often). But the higher a pitchers groundball rate, the easier it is for their defense to turn those ground balls into outs. In other words, a pitcher with a 55% groundball rate will have a lower BABIP on grounders than a pitcher with a 45% groundball rate. And if a pitcher walks a large number of batters and also has a high groundball rate, their double-play rate will be higher as well. As for flyballs, pitchers with a high flyball rate will have a lower Homerun Per Flyball rate than other pitchers. Finally we have a stat that A) is accurate and predictive, and B) accounts for some of the complexity of pitching. Context: SIERA is on a similar scale to ERA, so any score that is a good ERA is also a good SIERA. Please note that the following chart is meant as an estimate, and that league-average SIERA varies on a year-by-year basis. To see the league-average ERA for every year from 2002 to the present, check the FanGraphs leaderboards. Rating Excellent Great Above Average Average Below Average Poor SIERA 2.90 3.25 3.75 3.90 4.20 4.50

Awful 5.00

In general, relief pitchers have lower SIERA scores than starting pitchers. As a handy shortcut, a pitcher who switches from starting to the bullpen will on average see their SIERA drop by 0.37 points (and vice versa). Things to Remember: Interested in calculating SIERA yourself? Good luck. But if you want to try, here's Matt Swartz's formula and explanation. As always, when evaluating pitchers, it's best to use multiple statistics instead of relying on one alone. While SIERA is the most accurate of the ERA estimators, it's only slightly more accurate than xFIP. Both xFIP and FIP still have their uses, so I wouldn't recommend ditching them entirely and using only SIERA; a balanced approach is always best. You can learn a lot about a pitcher by looking at which metrics like and dislike them, and for what reasons. In and of itself, SIERA works as well as many projection systems in terms of predicting a player's future ERA. But be careful of this distinction: SIERA is technically a backward-looking ERA estimator and not a forward-looking projection system. If you want to turn SIERA from an estimator into a projection, you can follow the general formula laid out by Matt Swartz in this piece. SIERA is park-adjusted, meaning it accounts for the fact that some pitchers play in PETCO Park and others in Yankee Stadium. SIERA is also updated for the new (low-scoring) run environment around the majors.

tERA
True Runs Allowed (tERA) is a defense-independent ERA estimator built by Graham MacAree from StatCorner that was designed as an alternative to FIP and xFIP. The most common complaint about FIP and xFIP is that they completely ignore performance on balls in play, while batted balls can still tell us something about a pitchers skill level: groundballs are good (since they normally result in outs), flyballs have a higher probability of resulting in extra basehits, pop-ups are almost guaranteed outs, and line drives are the most likely type of ball in play to end up as a hit. tERA includes all of these variables, and is based on the same scale as ERA, FIP, and xFIP. It is a little less accurate in predicting future performance than xFIP, but it is still more valuable than ERA and provides us with another lens through which to evaluate pitchers. Now that SIERA is carried at FanGraphs, tERA is somewhat redundant. SIERA also includes batted ball information, and does so in a way that makes it a more accurate and predictive ERA estimator than tERA (albeit slightly). Thats not to say that tERA doesnt have its uses, but if you have to pick one, use SIERA. Context:

Please note that the following chart is meant as an estimate, and that league-average tERA varies on a year-by-year basis. To see the league-average tERA for every year from 2002 to the present, check the FanGraphs leaderboards.

Rating          tERA
Excellent       3.20
Great           3.50
Above Average   4.00
Average         4.20
Below Average   4.50
Poor            5.00
Awful           5.50

Things to Remember: The original form of this statistic is tRA, which is calculated on a Runs Allowed scale as opposed to an Earned Runs Allowed scale. This makes tRA less easy to compare with ERA, FIP, and xFIP, but some analysts still prefer this style because it excludes the subjectivity inherent in earned and unearned runs. You can find tRA over at StatCorner. Also at StatCorner, you can find tRA*, which is a regressed version of tRA. It uses a player's past performance and league-average rates to predict how a player should perform in the future. It may be listed in some locations as tRAr. Batted ball classifications should not be treated as 100% accurate, since they rely upon a human scorer saying, "Yes, that looks like a line drive and not a fly ball." This is a small caveat to keep in mind when using tERA or any statistic that relies on batted ball data.

Strikeout and Walk Rates


Strikeout Rate (K/9) and Walk Rate (BB/9) are rate statistics that measure how many strikeouts and walks a pitcher averages over nine innings. Of course, not many pitchers throw nine innings at a time anymore, but this standardizes the statistics on an easy-to-understand scale for most fans. For pitchers, more strikeouts are obviously better, while walks are bad. Good pitchers normally strike out at least twice as many batters as they walk. Some people prefer to measure pitcher strikeout and walk rates using K% and BB%, since those account for the fact that some pitchers face more batters per inning than others (and therefore get more chances to strike them out or walk them). Both K% and BB% are calculated using plate appearances in the denominator, not at bats.
Context: Please note that these charts are meant as estimates, and that league-average strikeout and walk rates vary on a year-by-year basis. To see the league-average strikeout and walk rate for every year from 1901 to the present, check the FanGraphs leaderboards.

Rating          K/9    K%
Excellent       10.0   25.0%
Great           8.5    22.5%
Above Average   7.5    20.0%
Average         7.1    18.5%
Below Average   6.0    15.0%
Poor            5.0    12.5%
Awful           4.5    10.0%

Rating          BB/9   BB%
Excellent       1.5    4.0%
Great           2.3    6.0%
Above Average   2.8    7.0%
Average         3.3    8.5%
Below Average   4.0    9.0%
Poor            4.5    9.5%
Awful           5.0    10.0%
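If you want to compute these yourself, the arithmetic looks like this. A quick Python sketch with made-up season totals (the function names are illustrative):

def per_nine(events, ip):
    # Rate per nine innings, e.g. K/9 = K / IP * 9 (same form for BB/9)
    return events / ip * 9.0

def rate_per_pa(events, plate_appearances):
    # K% and BB% use plate appearances (batters faced), not at bats
    return events / plate_appearances

# Example: 180 K and 55 BB over 190 IP and 780 batters faced
print(round(per_nine(180, 190), 1))           # K/9, about 8.5
print(round(rate_per_pa(180, 780) * 100, 1))  # K%, about 23.1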

GB%, LD%, FB%


Batted Ball Data (ground ball, fly-ball, and line-drive rates) presents the percentage of each batted ball type hit against a pitcher. Much like how hitters have partial control over their batted ball splits, pitchers do have some control over the way the ball is put into play against them. Depending on their pitching philosophy, pitchers can tend to be primarily ground ball or fly ball pitchers. Pitchers with high ground ball rates tend to give up more total hits, but they also allow fewer extra base hits. This is relatively intuitive: ground balls are harder to field than fly balls and they rarely go for extra bases (and almost never go for home runs). So pitchers who limit the amount of fly balls hit will also limit the amount of extra bases against them. Similarly, fly ball pitchers tend to allow fewer total hits, but more extra base hits. There are a few other interesting side effects to pitchers have extreme batted ball profiles. This is taken from the SIERA page, as SIERA uses batted ball data in its formula: In general, ground balls go for hits more often than fly balls (although they dont result in extra base hits as often). But the higher a pitchers ground ball rate, the easier it is for their defense to turn those ground balls into outs. In other words, a pitcher with a 55% ground ball rate will have a lower BABIP on grounders than a pitcher with a 45% ground ball rate. And if a pitcher walks a large number of batters and also has a high ground ball rate, their doubleplay rate will be higher as well. As for fly balls, pitchers with a high fly ball rate will have a lower Home Run Per Fly Ball rate than other pitchers.

Context: Please note that the following chart is meant as an estimate, and that league-average batted ball rates vary slightly on a year-by-year basis. To see the league-average batted ball breakdown for every year from 2002 to the present, check the FanGraphs leaderboards.

Batted Ball Type   League Average
LD                 20%
GB                 44%
FB                 36%
IFFB*              10%

Ground ball pitchers generally have grounder rates over 50%, while fly ball pitchers have fly ball rates above (or approaching) 40%. *Infield pop-ups are also tracked on FanGraphs (IFFB%), and they are expressed as the percentage of pop-ups out of the total number of fly balls.
Things to Remember: Line drives are death to pitchers, while ground balls are best for a pitcher. In numerical terms, line drives produce 1.26 runs per out, fly balls produce 0.13 runs per out, and ground balls produce only 0.05 runs per out. This data is tracked by Baseball Info Solutions (BIS), which is why it's only available back to 2002.
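As a quick illustration of how the percentages fit together, here is a Python sketch using made-up batted ball counts. Note that IFFB% is a share of fly balls, not of all balls in play, per the note above; the function name is hypothetical.

def batted_ball_rates(ld, gb, fb, iffb):
    # LD%, GB%, FB% are shares of all balls in play;
    # IFFB% is pop-ups as a share of fly balls (FanGraphs convention)
    bip = ld + gb + fb
    return {
        "LD%": ld / bip,
        "GB%": gb / bip,
        "FB%": fb / bip,
        "IFFB%": iffb / fb,
    }

# These counts reproduce the league-average profile in the chart above.
print(batted_ball_rates(ld=100, gb=220, fb=180, iffb=18))
# roughly {'LD%': 0.2, 'GB%': 0.44, 'FB%': 0.36, 'IFFB%': 0.1}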

HR/FB
Home Run to Fly Ball rate (HR/FB) is the ratio of home runs a pitcher allows per fly ball they give up. Home runs are obviously bad for a pitcher, and a pitcher can reduce the number of home runs hit against them in two ways: by increasing their ground ball rate (thereby lowering their fly ball rate), or by reducing their HR/FB ratio. While pitchers can control (to a certain extent) the type of batted balls hit against them, there is less skill involved in whether a long fly ball lands in the seats or on the warning track. For example, pitchers whose home ballpark has short fences will tend to have a higher HR/FB ratio than pitchers who throw in large ballparks. Pitcher HR/FB ratios have also been shown to vary considerably from year to year, meaning they have limited predictive value.
Context: Please note that the following chart is meant as an estimate, and that league-average HR/FB rate varies on a year-by-year basis. To see the league-average HR/FB rate for every year from 2002 to the present, check the FanGraphs leaderboards.

Rating          HR/FB
Excellent       5.0%
Great           7.0%
Above Average   8.5%
Average         9.5%
Below Average   10.5%
Poor            11.5%
Awful           13.0%

Remember, extreme home run rates in either direction are likely unsustainable. Certain pitchers can consistently post lower-than-average home run rates, though, so when trying to determine whether a pitcher's HR/FB rate is unsustainable, be sure to also compare it to their career rate.
Things to Remember: A glance at a pitcher's HR/FB ratio can help tell you whether a player had an over- or under-inflated ERA. Pitchers with HR/FB ratios much higher or lower than league average will normally regress toward league average in the future, which will have a corresponding effect on their ERA and FIP. One limitation of the HR/FB ratio is that home runs can also come off of line drives. Generally speaking, though, the main principles and implications of a pitcher's HR/FB ratio remain the same.

BABIP
Batting Average on Balls In Play (BABIP) measures how many balls in play against a pitcher go for hits. While typically around 30% of all balls in play fall for hits, there are three main variables that can affect BABIP rates for individual players:
a) Defense - Say a batter cracks a hard line drive down the third base line. If an elite fielder is playing at third, they may make a play on it and throw the runner out. However, if there's a dud over there with limited range, the ball could just as easily fly by for a hit. Pitchers have no control over the defenses behind them; all they can do is minimally affect whether a ball is more likely to be in the air or on the ground.
b) Luck - Sometimes, even against a great defense, bloop hits fall in. A batter may turn a nasty pitch into a dribbler that just sneaks past the first baseman, or they may blast a shot into the gap that a fielder makes a diving catch on. A pitcher can make the absolute perfect pitch against a batter, yet the hitter could still dribble it up the middle for a hit. That's just the game.
c) Changes in Talent Level - Over the course of a season, players go through periods of adjustment. Maybe a pitcher starts tipping one of their pitches, their mechanics are off, or they start leaving too many balls over the plate. Balls get hit harder until the pitcher makes the necessary adjustments, and the harder a ball is struck, the more likely it is to fall in for a hit.
For all these reasons, BABIP is inherently flaky and prone to variation, which can dramatically affect a pitcher's hits allowed and ERA. If very few balls in play fall for hits, a pitcher won't allow many runs to score and will have a very low ERA; similarly, if too many balls fall in for hits, a pitcher's ERA can skyrocket. If a pitcher has a very high or very low BABIP, whatever the reason for the spike (defense, luck, or slight skill changes), expect that player to regress back toward their career BABIP rate. BABIP rates are prone to vary wildly from year to year, so we should always take any extreme BABIP with a grain of salt.
Context: The average BABIP for pitchers is around .290 to .300, and pitchers have much less control over their BABIP than batters do. To what degree pitchers can influence their BABIP is still up for debate, but it has been shown that pitchers with high strikeout rates tend to generate weaker contact and, therefore, allow fewer hits on balls in play. The same is generally true of relievers, as they can dial up the intensity over shorter outings. Even then, though, high-strikeout pitchers still have career BABIPs only slightly lower than the typical .290 to .300 range. Think Justin Verlander (.285 career BABIP) or Clayton Kershaw (.279 career BABIP). And if any pitcher is posting an extremely deviant BABIP, expect them to regress toward league average.
Things to Remember: Saying a player will regress to league average is a tricky subject; see the section on regression for more info. Groundball pitchers have a lower BABIP on groundballs than other pitchers. In other words, if a pitcher has an extreme groundball rate, they also tend to be extra good at making sure those grounders are hit weakly and turned into outs.
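The formula isn't spelled out in this section, but the commonly used version of BABIP is (H - HR) / (AB - K - HR + SF): hits on balls in play divided by balls in play, with home runs and strikeouts excluded. Here is an illustrative Python sketch under that assumption, with hypothetical totals:

def babip(h, hr, ab, k, sf):
    # Commonly used formula (assumed here): (H - HR) / (AB - K - HR + SF)
    return (h - hr) / (ab - k - hr + sf)

# Example: 180 hits and 20 HR allowed over 750 AB with 180 K and 5 sac flies
print(round(babip(h=180, hr=20, ab=750, k=180, sf=5), 3))  # about .288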

LOB%
Left on Base Percentage (LOB%) measures the percentage of base runners that a pitcher strands over the course of a season. This stat does not use the left-on-base numbers reported in box scores; instead, it is calculated from a pitcher's actual hits, walks, and runs allowed:
LOB% = (H + BB + HBP - R) / (H + BB + HBP - (1.4 * HR))
Most pitchers have LOB%s around league average (approximately 70-72%, depending upon the season), and pitchers who deviate from that average tend to see their numbers regress toward average in the future. In other words, if you see a pitcher with a 60% LOB%, they are letting lots of runners score and their ERA will be high, but the odds are that they will strand more runners in the future and lower their ERA. Not all pitchers will regress toward league average, though: high-strikeout pitchers have been shown to have some control over their LOB%. Pitchers who record a high number of strikeouts can pitch their way out of jams more easily than pitchers who rely upon their team's defense, so they are able to maintain LOB%s higher than league average. Also, if a pitcher isn't a major-league-caliber starter, or is a borderline case, it's likely that their true-talent LOB% is below league average. By using this statistic in conjunction with others (specifically BABIP and HR/FB), it's possible to get an idea of whether a pitcher is under- or over-performing and likely to regress.
Context: Please note that the following chart is meant as an estimate, and that league-average LOB% varies on a year-by-year basis. To see the league-average LOB% for every year from 1901 to the present, check the FanGraphs leaderboards.

Rating          LOB%
Excellent       80%
Great           78%
Above Average   75%
Average         72%
Below Average   70%
Poor            65%
Awful           60%
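Here is the formula above as a small Python sketch, with made-up season totals for illustration:

def lob_pct(h, bb, hbp, r, hr):
    # LOB% = (H + BB + HBP - R) / (H + BB + HBP - 1.4*HR)
    return (h + bb + hbp - r) / (h + bb + hbp - 1.4 * hr)

# Example: 180 H, 55 BB, 5 HBP, 85 R, and 20 HR allowed
print(round(lob_pct(h=180, bb=55, hbp=5, r=85, hr=20), 3))  # about 0.731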

ERA- / FIP- / xFIP-
ERA Minus, FIP Minus, and xFIP Minus are the pitching versions of OPS+ and wRC+: a simple way to tell how well a player performed relative to league average. All of these statistics are on a similar scale, where 100 is league average and each point above or below 100 represents one percent above or below league average. However, since lower is better for (almost) all pitching stats, a lower ERA- or FIP- is better. For example, Josh Johnson led the majors in 2010 with a 60 FIP-, meaning his FIP was 40% better than league average. On the other end of the spectrum, Dave Bush had a 130 FIP- that season, meaning his FIP was 30% worse than league average. If the acronyms seem funky, realize that you don't have to use them in order to get a point across: "Such-and-such has had an xFIP 20% better than league average," or "So-and-so has performed 10% worse than league average, according to fielding independent metrics." These statistics are all park and league adjusted, so they account for the fact that some pitchers throw in Fenway Park while others throw in PETCO Park. These adjustments also make it possible to compare pitchers between years and time periods. For information on how these stats are calculated, see this article at Walk Like A Sabermetrician. In that article, ERA- is described as aERA.
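The exact math (including the park adjustment) is in the linked article, but the basic idea can be sketched like this. This is a simplified illustration only: the single park_factor multiplier here is a stand-in for the real adjustment, not the published formula.

def era_minus(player_era, league_era, park_factor=1.0):
    # Simplified sketch: 100 is league average, lower is better.
    # A real ERA- also folds in a park adjustment; here it is reduced
    # to one hypothetical multiplier for illustration.
    return 100 * (player_era / park_factor) / league_era

# A 2.40 ERA against a 4.00 league average is a 60 ERA-,
# i.e. 40% better than league average.
print(round(era_minus(2.40, 4.00), 0))  # 60.0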

SD / MD

Shutdowns (SD) and Meltdowns (MD) are two relatively new statistics, created as an alternative to Saves in an effort to better represent a relief pitcher's value. While there are some odd, complicated rules surrounding when a pitcher gets a save, Shutdowns and Meltdowns strip away these complications and answer a simple question: did a relief pitcher help or hinder his team's chances of winning a game? If they improved their team's chances of winning, they get a Shutdown. If they instead made their team more likely to lose, they get a Meltdown. Intuitive, no? Using Win Probability Added (WPA), it's easy to tell exactly how much a specific player contributed to their team's odds of winning on a game-by-game basis. In short, if a player increased his team's win probability by 6% (0.06 WPA), they get a Shutdown. If a player made his team 6% more likely to lose (-0.06 WPA), they get a Meltdown. Shutdowns and meltdowns correlate very well with saves and blown saves; in other words, dominant relievers are going to rack up both saves and shutdowns, while bad relievers will accrue meltdowns and blown saves. But shutdowns and meltdowns improve upon saves and blown saves by giving equal weight to middle relievers, showing how they can affect a game just as much as a closer can, and by capturing more negative reliever performances.
Context: The +/- 6% cutoff puts SDs and MDs on a similar scale as saves and holds, meaning 40 shutdowns is roughly as impressive as 40 saves or 40 holds. Dominant closers or set-up men will typically have 35 to 40+ shutdowns and a handful of meltdowns. Meanwhile, meltdowns are more common than blown saves, and they can happen to closers and non-closers alike. The worst relievers will rack up around 10 to 15 meltdowns in a season.

Rating          SD   MD
Excellent       40   2
Great           35   4
Above Average   25   6
Average         20   8
Below Average   15   10
Poor            10   12
Awful           5    15
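Counting shutdowns and meltdowns from a reliever's game-by-game WPA is straightforward. A minimal sketch, assuming you already have the per-game WPA values (the function name is illustrative):

def shutdowns_meltdowns(game_wpa_list):
    # A relief appearance with WPA >= +0.06 counts as a Shutdown,
    # and one with WPA <= -0.06 counts as a Meltdown (the 6% cutoff above).
    sd = sum(1 for wpa in game_wpa_list if wpa >= 0.06)
    md = sum(1 for wpa in game_wpa_list if wpa <= -0.06)
    return sd, md

print(shutdowns_meltdowns([0.12, -0.08, 0.02, 0.07, -0.01]))  # (2, 1)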

Plate Discipline (O-Swing%, Z-Swing%, etc.)


Plate Discipline statistics tell us a number of things about pitchers: how effective they are at attacking the zone, how often they get hitters to chase pitches (both inside and outside the zone), how often they throw a first-pitch strike, etc. There are a wide number of these stats:
O-Swing%: The percentage of pitches outside the strike zone that a batter swings at.
Z-Swing%: The percentage of pitches inside the strike zone that a batter swings at.
Swing%: The overall percentage of pitches a batter swings at.
O-Contact%: The percentage of pitches outside the strike zone that a batter makes contact with when swinging.
Z-Contact%: The percentage of pitches inside the strike zone that a batter makes contact with when swinging.
Contact%: The overall percentage of pitches a batter makes contact with when swinging.
Zone%: The overall percentage of pitches a batter sees inside the strike zone.
F-Strike%: The percentage of plate appearances that begin with a first-pitch strike.
SwStr%: The percentage of total pitches a batter swings at and misses.
FanGraphs carries plate discipline statistics based on both Baseball Info Solutions (BIS) data and PITCHf/x data. There are some minor differences between the two systems: the PITCHf/x section presents raw PITCHf/x data broken down according to defined baselines, while the BIS section takes the PITCHf/x data and uses human scorers to modify classifications. Here are the batter plate discipline leaderboards according to BIS data, and here are the same leaderboards according to PITCHf/x data.
Context: Please note that the following chart is meant as an estimate, and that the league average for all of these stats varies on a year-by-year basis. To see the league-average plate discipline stats for every year from 2002 to the present, check the FanGraphs leaderboards.

Stat        Average
O-Swing     30%
Z-Swing     65%
Swing       46%
O-Contact   68%
Z-Contact   88%
Contact     81%
Zone        45%
F-Strike    59%
SwStr       8.5%

Things to Remember: These statistics are useful for evaluating both hitters and pitchers, with SwStr% being especially important when looking at pitchers. Swinging strike percentage (SwStr%, swinging strikes per pitch) should not be confused with whiff rate (swinging strikes per swing).

Pitch Type Linear Weights


The Pitch Type Linear Weights (Pitch Values) section on FanGraphs attempts to answer the question, Which pitch is a pitchers best weapon? The changes in run expectancy between an 0-0 count and a 0-1 or 1-0 count are obviously very small, but when added up over the course of the season, you can get an idea of which pitch typically yields the best results for a

pitcher. If one pitch is hit especially hard or a pitcher cant locate one pitch for a strike, these problems will show up using Pitch Type Linear Weights. Also, if a pitcher gets lots of strikes and outs with a specific pitch, this success will show up. Youll notice that there are two different types of Pitch Type Linear Weights: total runs by pitch (which is shown as wFB, wSL, wCB, etc.) and standardized runs by pitch (shown as wFB/C, wSL/C, wCB/C, etc.). The first category is the total runs that a pitcher has saved using that pitch. However, it is tough to compare these total numbers since pitchers throw different amounts of each pitch. The second category corrects for this, standardizing the values on a per 100 pitch basis. In other words, when you see wFB/C, that represents the amount of runs that pitcher saved with their fastball over the course of 100 fastballs thrown. Context: A score of zero is average, with negative scores being below average and positive scores being above average. In general, pitches will generally fall somewhere between +20 and -20 runs, with the most extreme pitches touching +/-30. On a per 100 pitch basis, the range shrinks to around +1.5 to 1.5 runs. Again, youll see some extreme scores on either end of the spectrum, but thats the range that most pitches and pitchers fall into. Things to Remember: This statistic has limited predictive power. It can show you what pitches a pitcher has had success with in the past, but you should be careful in extrapolating those results and projecting the future. Its a descriptive statistic, not a predictive one. Beware of sample sizes! Sometimes a pitcher may have a handful of his pitches misclassified, showing that he throws a slider when he really doesnt. Also, if these handful of pitches are really successful or really unsuccessful, that pitcher could show up towards the top or bottom of the wSL/C leaderboard. In other words, before drawing conclusions from the standardized leaderboard, be sure that all the pitchers up there have thrown a large amount of the pitch youre looking at. Pitch Type Linear Weights can also be used to evaluate hitters, seeing which pitches they have had most success against in the past. In this case, the values are flipped; positive values are a good result for the hitter, not the pitcher.

WPA
Most sabermetric statistics are context neutral: they do not consider the situation of a particular event or how some plays are more crucial to a win than others. While wOBA rates all home runs as equal, we know intuitively that a home run in the third inning of a blowout is less important to a win than a home run in the bottom of the ninth inning of a close game. Win Probability Added (WPA) captures this difference by measuring how individual players affect their team's win expectancy on a per-play basis. For example, say the Rays have a 45% chance of winning before Ben Zobrist comes to the plate. During his at-bat, Zobrist hits a home run, pushing the Rays' win expectancy to 75%. That difference in win expectancy (in decimal form, +.30) from the beginning of the play to the end is Ben Zobrist's WPA for that play. If Zobrist strikes out during his next at-bat and lowers his team's win expectancy by 5%, his overall WPA for the game so far would be +.30 - .05 = +.25, since WPA is a counting statistic and is additive.
Context: Technically, WPA values for events that contribute positively to a win can range from about 2% (.02 WPA) to 95% (.95 WPA). The extreme swings in WPA are not terribly common, just as walk-off home runs are exciting events we don't see every day. Cumulatively, season-long WPA is not predictive, making it an ineffective number for projecting a player's talent. However, it is a good describer of what happened in a game and how a win was achieved. And since +1 WPA equals 100% in win expectancy, +1 WPA is the equivalent of one win. For MLB regulars, here's a quick breakdown of season-long WPA scores:

Rating          WPA
Excellent       +6.0
Great           +3.0
Above Average   +2.0
Average         +1.0
Below Average   0.0
Poor            -1.0
Awful           -3.0

Things to Remember: WPA is not highly predictive. Generally, it is not used for player analysis or projecting the future, but it does give us a picture of which players helped their team the most during the course of a game. A fun way to think of WPA is as a storytelling statistic: it highlights the big (and most exciting) moments of a game as well as the players who contributed most to a win (or loss). Like RBI and HR, WPA is a counting statistic, meaning that players with more playing time will have more opportunities to accrue a higher WPA.
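Because WPA is just the sum of win expectancy changes, the Zobrist example works out like this (illustrative Python):

def game_wpa(win_expectancy_changes):
    # WPA is additive: sum the change in the team's win expectancy
    # (in decimal form) for every play the player was involved in.
    return sum(win_expectancy_changes)

# Zobrist's two plays from the text: +.30 on the homer, -.05 on the strikeout
print(round(game_wpa([0.30, -0.05]), 2))  # 0.25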

Get to Know: RE24


by David Appelman - March 14, 2008 RE24 (runs above average by the 24 base/out states): RE24 is the difference in run expectancy (RE) between the start of the play and the end of the play. That difference is then credited/debited to the batter and the pitcher. Over the course of the season, each players RE24 for individual plays is added up to get his season total RE24. Calculation Example: In game 4 of the 2007 World Series, the RE for the Red Sox to start the inning was .52. When Jacoby Ellsbury doubled off Aaron Cook in the very first at-bat in

the game, the Red Sox were then expected to score 1.15 runs for the rest of the inning. The difference or RE24 was .63 runs. Ellsbury was credited +.63 runs and Aaron Cook credited with -.63 runs. Why you should care: RE24 tells you how many runs a player contributed to his team. Its similar to WPA (except in runs), but unlike WPA it does not take into account the inning or score of the game. Therefore, it is a more context neutral statistic. It does however take into account how many runners are on base and how many outs are left in the inning. Variations: REW (run expectancy wins) is RE24 converted to wins.
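Here is the Ellsbury example as a tiny Python sketch. The runs_scored_on_play term follows the usual RE24 convention of also crediting any runs that actually score during the play (none did on the double), so treat it as an assumption rather than part of the quoted example.

def re24(re_start, re_end, runs_scored_on_play=0):
    # Credit/debit is the change in run expectancy plus any runs that
    # actually scored on the play.
    return (re_end + runs_scored_on_play) - re_start

# Start of inning RE of .52, RE of 1.15 after the leadoff double
print(round(re24(0.52, 1.15), 2))  # +0.63 for Ellsbury, -0.63 for Cook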


LI
During the course of a game, some situations are more tense and suspenseful than others. For instance, we know that a one-run lead in the bottom of the ninth inning is more suspenseful than a one-run lead in the top of the third inning, and batting with two runners on and two outs in the eighth inning carries more pressure than batting in the same situation in the second inning. Leverage Index (LI) is an attempt to quantify this pressure so we can determine whether a player has been used primarily in high-leverage or low-leverage situations. There are many different iterations of LI, including:
pLI: A player's average LI for all game events.
phLI: A batter's average LI in only pinch-hit events.
gmLI: A pitcher's average LI when he enters the game.
inLI: A pitcher's average LI at the start of each inning.
exLI: A pitcher's average LI when exiting the game.
Context: An average (or neutral) LI is 1. High leverage is 1.5 and above, and low leverage is below 1. About 10% of all real game situations have an LI greater than 2, while 60% have an LI less than 1.
Things to Remember: Leverage Index depends on the inning, score, outs, and number of runners on base. If you go to the Splits section of a player's FanGraphs page, you can find how the player performed in low-, medium-, and high-leverage situations. While some players may have performed well in high-leverage situations compared to their average performance, that does not necessarily mean they will continue to produce that way in the future. Clutch hitting is generally the result of small sample sizes and random variation; a player who is very clutch one season will not necessarily be very clutch the next.


WPA/LI
Before you try to tackle Context Neutral Wins (WPA/LI), make sure you have a solid understanding of both WPA and LI. Got 'em? Okay, cool. We know that WPA measures a player's offensive contributions via win expectancy, while Leverage Index measures the average leverage of those situations. At the end of a game, not all players will have the same LI; some will have been in more pressure-filled situations than others. A player with a high leverage index may have a higher WPA simply because they happened to come up more often when the game was on the line. So how can we compare two players' contributions to wins? With this in mind, if we divide WPA by LI, we see how much value a player provided regardless of leverage. This number is called Context Neutral Wins (WPA/LI) because it neutralizes leverage while still measuring wins added (remember: 1 WPA = 100% win expectancy). WPA/LI is calculated for every plate appearance over the course of a season and is then summed to give a player their total WPA/LI. It is a good way to compare WPA between players. Again, WPA/LI measures how much value a player added to their team regardless of leverage. Because of this, it is more a measure of a player's talent level than WPA is.
Context: It's somewhat helpful to think of WPA/LI as a win expectancy version of WAR*, although the rankings aren't exactly the same because context neutral wins are set with zero as average, not replacement level. Here's a general breakdown:

Rating          WPA/LI
Excellent       5.0
Great           3.0
Above Average   1.5
Average         0.0
Below Average   -0.5
Poor            -1.3
Awful           -2.0

*Please note that WAR and WPA/LI are far from the same thing; they are just on somewhat similar scales. WAR calculates a player's value based on offense, defense, the position they play, etc., while context neutral wins focuses solely on a player's offensive win expectancy contributions.
Things to Remember: WPA/LI allows us to compare WPA between players, but it is still not as predictive a measure as wOBA or WAR.

Clutch
Clutch measures how well a player performed in high leverage situations. It's calculated as such:
Clutch = (WPA / pLI) - WPA/LI
In the words of David Appelman, this calculation measures "how much better or worse a player does in high leverage situations than he would have done in a context neutral environment." It also compares a player against himself, so a player who hits .300 in high leverage situations when he's an overall .300 hitter is not considered clutch. Clutch does a good job of describing the past, but it does very little toward predicting the future. Simply because one player was clutch at one point does not mean they will continue to perform well in high-leverage situations (and vice versa). Very few players have the ability to be consistently clutch over the course of their careers, and choking in one season does not beget the same in the future.
Context: The majority of players in the league end up with Clutch scores between -1 and 1, with zero being neutral, positive scores being clutch, and negative scores being choke. Only a few players each year are lucky enough (or unlucky enough) to post extreme Clutch scores.

Rating          Clutch
Excellent       2.0
Great           1.0
Above Average   0.5
Average         0.0
Below Average   -0.5
Poor            -1.0
Awful           -2.0
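Since the formula is given above, a sketch is easy; the sample inputs are hypothetical:

def clutch(wpa, pli, wpa_li):
    # Clutch = (WPA / pLI) - WPA/LI
    return wpa / pli - wpa_li

# A player with +3.0 WPA accumulated at an average leverage of 1.2,
# but only +2.0 context neutral wins, grades out as modestly clutch.
print(round(clutch(3.0, 1.2, 2.0), 2))  # 0.5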

What is WAR?
Wins Above Replacement (WAR) is an attempt by the sabermetric baseball community to summarize a player's total contributions to their team in one statistic. You should always use more than one metric at a time when evaluating players, but WAR is pretty darn all-inclusive and provides a handy reference point. WAR basically looks at a player and asks: if this player got injured and their team had to replace them with a minor leaguer or someone from their bench, how much value would the team be losing? This value is expressed in a wins format, so we could say that Player X is worth +6.3 wins to their team while Player Y is only worth +3.5 wins. Calculating WAR is simpler than you'd think. If you want the detailed (yet very understandable) version, check out the links at the bottom of the page; Dave Cameron does a good job of walking through the process step by step. The short answer, though, is as follows:
Offensive players - Take wRAA, UBR, and UZR (which express offensive, base running, and defensive value in runs above average) and add them together. Add in a positional adjustment, since some positions are tougher to play than others, and then convert the numbers so that they're not based on league average, but on replacement level (the value a team would lose if they had to replace that player with a replacement player: a minor leaguer or someone from the waiver wire). Convert the run value to wins (10 runs = 1 win) and voila, you're finished.
Pitchers - Where offensive WAR uses wRAA and UZR, pitching WAR uses FIP. Based on how many innings a pitcher threw, FIP is turned into runs form, converted to represent value above replacement level, and then converted from runs to wins.
WAR is available in two places: FanGraphs (fWAR) and Baseball-Reference (rWAR). Both statistics use the same framework but are calculated slightly differently and therefore sometimes show different results. The above explanation is for fWAR; see the section below on rWAR for more information on the differences between the two iterations of WAR.
Context: League-average WAR rates vary. An average full-time position player is worth about +2 WAR, while average bench players contribute much less (typically under +1 WAR). Average starting pitchers are also worth around +2 WAR, while relief pitchers are considered superb if they crack +1 WAR. For position players and starting pitchers, here is a good rule-of-thumb chart:

Scrub          0-1 WAR
Role Player    1-2 WAR
Solid Starter  2-3 WAR
Good Player    3-4 WAR
All-Star       4-5 WAR
Superstar      5-6 WAR
MVP            6+ WAR
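Here is a rough Python sketch of the position-player recipe described above, using the 10-runs-per-win shortcut from this guide. The component values are hypothetical, and the real calculation uses season-specific conversions and league adjustments, so treat this purely as an illustration of how the pieces add up.

def position_player_war(wraa, ubr, uzr, pos_adj, replacement_runs):
    # Offense + base running + defense + positional adjustment,
    # shifted to the replacement-level baseline, then runs -> wins at 10 runs per win.
    value_runs = wraa + ubr + uzr + pos_adj + replacement_runs
    return value_runs / 10.0

# Hypothetical season: +25 batting, +2 base running, +8 fielding,
# +2.5 positional adjustment, +20 replacement-level adjustment
print(position_player_war(25, 2, 8, 2.5, 20))  # 5.75 WAR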

Also, here's a breakdown of all the players in baseball in 2010, courtesy of Justin Bopp from Beyond the Boxscore.

Things to Remember: Since there is no UZR data for catchers, the fielding component for catcher fWAR is calculated using two parts: the Stolen Base Runs Saved (rSB) metric from the Fielding Bible, and Runs Saved from Passed Pitches (RPP). This accounts for a large portion of a catcher's value, although pitch framing is not yet included in WAR. WAR is context, league, and park neutral, which means you can use it to compare players between years, leagues, and teams. It is possible to have a negative WAR. In fact, the worst fWAR any player has posted since 2002 belongs to Neifi Perez of the Royals, who was worth an incredible -3.1 wins in 2002.

Win Values Explained: Part One


by Dave Cameron - December 29, 2008 So, youve probably noticed that the Big New Thing around here is Win Values for position players. David has added them to the player pages, the team pages, and even the leaderboards. Now, instead of sounding like a total nerd by telling your friends that Pujols is awesome because he had an .843 wOBA (not his real number, but not out of the question), you can simply tell them that he was a nine win player last year, worth about $40 million in 2008 salary. Nine wins. Forty Million. Everyone understands these numbers. Thats the beauty of win values we can express a players contribution to his team in ways that are both meaningful and easy to understand. As much as I love WPA/LI, its just never going to be something that the casual fan is going to understand without a good bit of

explanation. Win values, though I can tell my mom that Adrian Beltre is a four win player and shell understand in 30 seconds. And, without too much more explanation, I can explain that those four wins are worth about $18 million in salary, and so not only is Beltre worth his salary, but hes actually something of a bargain. Win Values are a big open door to acceptance of our particular brand of analysis among nonstatheady fans, and even within our little insulated community, theyre still a big step forward over the commonly accepted performance metrics of the last few years. However, rather than just telling you that and having you trust us, we figured itd be a good idea to explain how the win values are calculated and break down each part of the formula for you to see. So, this week, well be looking at the calculations of each part and walking everyone through the steps to create a win value for a particular position player. This afternoon, well start with the Batting component. Sticking with the Adrian Beltre example, we see that he hit .266/.327/.457 last year. However, while thats interesting, do you know how valuable that is just by looking at it? Me either. Thats why theres wOBA, which takes all the results of a players plate appearances throughout the year and uses run value weights to sum up a players production at the plate in a number that is easily converted to runs above average. That number, wRAA, is found right next to wOBA on each players page. It is, essentially, a players value that he produces at the plate relative to a league average hitter. You can read more about wOBA in The Book, and we went into detail about it a while ago. However, you may notice that a players Batting value doesnt match his wRAA value on the player cards. Thats because the wRAA numbers on the site are not park adjusted, but to build a proper win value, you have to include the effects of a players home environment. Getting back to Beltre, he plays in a park that depresses run scoring, so the runs that he creates are more valuable than they would be if they came in a park where runs were more plentiful. So, while his raw offensive line may have only been worth 3.9 runs, when we adjust for Safeco Field, his Batting value goes up to 5.9 runs. That number the 5.9 in Beltres case represents the amount of runs above or below average that each player created with their bat for a given year. This number is not position adjusted, as I detailed my issues with offensive position adjustments back in November. Well add the position adjustments in later, so it gets included in a players total value, but I think its incorrect to add it to the offensive total. So, when youre looking at the Win Values section, thats Batting offensive runs above or below average, not position adjusted, but adjusted for the run environment of his home park.

Win Values Explained: Part Two


by Dave Cameron - December 29, 2008 This afternoon, we looked at the offensive side of the Win Value equation. Linear weights are pretty widely accepted now, and since wOBA and wRAA match up so well with what people understand about offensive value (Albert Pujols had a good year, Jeff Francoeur did not), its not a big surprise that there isnt much questioning that part of the formula. We get into a bit stickier water when we move to the Fielding side of things, however.

In the Win Values calculations here, Fielding is fairly straight forward its simply a players total UZR at all positions for the given year. We talked a lot about UZR the last few weeks after it was added to the site, but if youre looking for a more detailed explanation of the system, the introduction can be found in part one and part two. Essentially, its the best fielding metric publicly available, and while its not perfect (I generally give it an error range of five runs in either direction, meaning that a +10 could be anything between a +5 and +15), its a big step forward in defensive evaluations. UZR, unlike the offensive component wRAA, is relative to the league average of the position for that player. We talked about this a few weeks ago in talking about how to read the UZR numbers. +15 in LF is simply not an equal performance to +15 in CF, as the players they are getting compared to are drastically different. This creates the need for position adjustments to account for the difference in quality between positions. However, when displaying the Win Values here, weve broken them out into separate components to be as transparent as possible, so the Fielding numbers do not include the position adjustment. Well get to those next. The fielding total is simply a sum of the players UZR from each position he played in the given year. The current version of UZR on FanGraphs does not include a few minor things, such as the value of arm strength and turning double plays. Some of these will be added in soon as MGL updates the data, but the changes are going to be minor despite the emphasis put on it by many, there just isnt a huge difference betwee most major league players in terms of the runs saved through their throws from the outfield. However, if you feel like a particular player has an exceptional arm and should be rewarded for it, feel free to add in a couple of runs to make up for the fact that UZR doesnt include that portion of his defensive value. The other important point to make here is that youll notice that catchers have no values entered in the Fielding portion of their Win Values. Evaluating catcher defense is something were simply not very good at right now, and while there are strides being made (including a great article by Tom Tango in the 2009 Hardball Times Annual), theres a lot of things that we havent figured out how to quantify yet. So, weve just left catchers alone, ranking them all as league average, and will let you all adjust their final win value however youd like to reflect their defensive value relative to other catchers. If you think Joe Mauers catching abilities and leadership are worth one win, just add one win to what we display as his win value here. Quantifying catching defense is something that we just havent figured out yet, and so were not pretending that we have. Consider it an opportunity to fill in the blanks. Tomorrow, well look at the position adjustments and how the wRAA, UZR, and position adjustments add up.

Win Values Explained: Part Three


by Dave Cameron - December 30, 2008 Continuing on with our series explaining win values, today, we get back to positional adjustments. We spent a lot of time talking about them the last few weeks, so if you havent read those articles, Id suggest catching up first. A basic summary of the need for position adjustments follows below, for those who want a short version.

Since we started off with wRAA (which is offensive runs above or below league average) and UZR (which is defensive runs above or below league average at a specific position), we need to calibrate the scale to make up for the fact that some positions are significantly harder to play than others. It is much harder to find a +5 SS than it is to find a +5 2B, and we need to represent that in the Win Value system. Thats what the position adjustments are there for. Traditionally, offensive position adjustments have been popular, which aligns the positions by adjusting on the basis of the difference in offensive runs. However, due to the variability in offensive performance from year to year, that can lead to miscalculations, such as believing that an NL 2B and an NL SS were equal in 2008 because they had the same batting line. Clearly, shortstops are better defenders than second baseman, and we have to reflect this in their value. Thats why we prefer a defensive position adjustment. The position adjustment scale we use is as follows: Catcher: +12.5 runs (all are per 162 defensive games) First Base: -12.5 runs Second Base: +2.5 runs Third Base: +2.5 runs Shortstop: +7.5 runs Left Field: -7.5 runs Center Field: +2.5 runs Right Field: -7.5 runs Designated Hitter: -17.5 runs To read more about how these were arrived at, check out these threads at The Book blog. The position adjustments are then scaled to match the games played at each position for a particular player. This way, players that spend time at multiple positions get a hybrid adjustment based on their playing time at the respective spots. Once you add the wRAA, UZR, and position adjustment together, you have the sum of a players value above or below league average. If we used Chase Utley as an example, he gets 37.1 wRAA, 19.2 UZR, and 2.3 Position Adjustment for a total of +58.6 runs. In 2008, Chase Utley was 58.6 runs better than a league average player. If you want to start handing out credit for the World Series title in Philly, give him the most, because thats outstanding. However, now we have value above or below average, but what is average worth? Clearly, its worth more than zero, as teams pay significant cash for league average players every winter, and a team full of league average players would win 81 games and generate positive revenue for their franchise. But, in terms of dollar values, we dont have a fixed baseline for what a league average player is paid. However, you know what we do have a fixed baseline for? The league minimum player. MLB has set $400,000 as the least any player can get paid, so we know that a player who is completely replaceable is worth $400,000. That makes replacement level a good target to calculate value off of. So, this afternoon, well talk about replacement level and how that is defined.
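To make the arithmetic in this installment concrete, here is a small Python sketch of the pro-rating and the sum, using the Chase Utley numbers quoted above. The function names are illustrative only.

def position_adjustment(per_162_adj, games_at_position):
    # The per-162-games adjustment is pro-rated to the games actually played there;
    # e.g. a second baseman's +2.5 over roughly 149 games comes out near +2.3.
    return per_162_adj * games_at_position / 162.0

def runs_above_average(wraa, uzr, pos_adj):
    # wRAA + UZR + positional adjustment = value above a league-average player
    return wraa + uzr + pos_adj

# Utley, 2008 (values from the text): 37.1 wRAA + 19.2 UZR + 2.3 position
print(round(runs_above_average(37.1, 19.2, 2.3), 1))  # 58.6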

Win Values Explained: Part Four


by Dave Cameron - December 30, 2008 Okay, so, in the first three parts, weve covered Batting, Fielding, and Position Adjustments, and hopefully youve been able to see how were arriving at the values used for each component. By combining those three parts, you get runs above or below average. However, as I mentioned at the end of the last post, we dont really know how much average costs, but we do know how much replacement level costs, so we prefer to value players above replacement, as that gives us a fixed baseline of $400,000 in salary the league minimum. For a great read on replacement level, check out this article by Sean Smith. In it, he uses his CHONE projections to figure out the offensive production that a team could expect from players not projected to be good enough to make a major league roster next year. These guys have fallen into that Four-A category, where they show more ability than your average TripleA veteran but not enough to hold down a major league job. Theyre usually available every winter as minor league free agents, via the Rule 5 draft, or as cheap trade acquisitions where a team can acquire one of these players without giving up any real talent in return. As Sean showed in his article, and has been shown elsewhere, the expected value of a replacement level player is about negative 20 runs per 600 PA. Or, to phrase it a bit differently, if you lost a league average player and replaced him with a freely available guy, youd lose about two wins. Thats why the replacement level calculation in our Win Value formula is 20/600*PA. If you get exactly 600 PA during a season, your replacement level adjustment will be +20 runs. If you get 700 PA, your replacement level adjustment will be +23.3 runs. The more you play, the higher the replacement level adjustment, because youre filling a larger quantity of playing time and that chunk wont need to be filled by anyone else. The replacement level calculation serves to do two things in our calculations adjust the scale so that the baseline value is $400,000 at zero wins, and rewards players who stay on the field. For instance, Chipper Jones was outstanding in 08, posting a .446 wOBA and a +4.9 UZR. However, he only racked up 534 PA, so the Braves had to give approximately 66 PA to people who werent Chipper Jones. Therefore, Chippers replacement level adjustment is just 17.8 runs we presume that the folks who filled in for him were about 2.2 runs below average in those 66 PA, and that comes out of Chippers replacement level adjustment. Players who stay healthy and can take the field everyday have value above and beyond their rate statistics, and scaling the replacement level adjustment to plate appearances rewards them for that extra value. If youre having a tough time visualizing what a replacement level player looks like, theres probably not a better example in baseball than Willie Bloomquist. Over the last three years, hes racked up 644 PA just barely more than one seasons worth and accumulated the following totals: -16 batting, -3.8 fielding, +0.9 position adjustment = -18.9 runs below average. Hes not a very good hitter, but he can play a bunch of positions, run the bases okay, and doesnt cost much. He is, essentially, the poster boy for replacement players. By adding in the replacement level adjustment, were simply adjusting from saying that Chase Utley is +58 runs above average to +78 runs above Willie Bloomquist. And, since we know that players of

Bloomquists quality are available for $400,000, we can then value Utleys performance based off that baseline. So, thats wRAA+UZR+Position+Replacement. It comes out as Value Runs, and tells you how many runs above a replacement level player each position player was. Tomorrow, well talk about the runs to wins conversion and the wins to dollars conversion.
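Before moving on, the replacement adjustment from this installment is simple enough to sketch: 20 runs per 600 plate appearances, scaled to playing time. This reproduces the Chipper Jones example above (illustrative function name):

def replacement_runs(pa):
    # Replacement level is roughly -20 runs per 600 PA,
    # so the credit scales with plate appearances.
    return 20.0 * pa / 600.0

print(round(replacement_runs(600), 1))  # 20.0
print(round(replacement_runs(534), 1))  # 17.8, the Chipper Jones case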

Win Values Explained: Part Five


by Dave Cameron - December 31, 2008 For the last couple of days, weve been talking about the different components of the Win Value system. However, you may have noticed that weve been dealing entirely in runs. wRAA, UZR, the position adjustment, and replacement level are all expressed in terms of runs, not wins, and thats why the sum of those numbers is categorized under Value Runs. So, if all of our metrics deal in terms of runs, but we want to get to wins, we need to know how many runs it takes to make a win. This is actually quite a bit easier than it sounds, thanks to the pythag formula for expected win-loss records. For those not aware of pythag, you can get a good estimate of a teams winning percentage by dividing the square of runs scored by the sum of the square of runs scored and the square of runs allowed. Or, in formula terms: RS^2/(RS^2 + RA^2) = Pythagorean Winning Percentage. So, if a team scored 775 runs and allowed 775 runs, theyd have a .500 Pythag Win%, or 81 wins and 81 losses even amounts of runs scored and runs allowed should lead to something like an even record. Not as scary as it sounds. What happens if we subtract 10 runs from the runs scored column, so that we now have a 765 RS/775 RA team? Pythag spits out a .4935 win%, and .4935 * 162 = 79.95 wins. So, instead of 81 wins, youre now expected to win just barely less than 80. By subtracting 10 runs, you lost a fraction more than one win. Same thing happens if you add 10 runs to the runs allowed column 775/785 RS/RA spits out .4935 as well. How about if you add 10 runs, so we have a 785/775 team? .5064 win%, or 82.03 wins. Again, 10 runs added equals one win gained. For an even more precise look at the issue, you could use the improved PythagenPat method, which places a better exponent in the calculation, but the conclusion is going to be the same; 10 runs = 1 win. So, when you see value expressed in runs, but you want it in wins, just divide by 10. Likewise, if you see it in wins but you want it in runs, multiply by 10. It might sound like a cheap trick, but its reality 10 runs add up to a win. A +20 run player is a +2 win player.
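Here is that pythag demonstration as a quick Python sketch, showing that adding ten runs to the runs scored column buys about one extra win over a 162-game season (the function name is illustrative):

def pythag_wins(runs_scored, runs_allowed, games=162):
    # Pythagorean expectation: RS^2 / (RS^2 + RA^2), scaled to games played
    win_pct = runs_scored**2 / (runs_scored**2 + runs_allowed**2)
    return win_pct * games

base = pythag_wins(775, 775)      # 81.0 wins for an even RS/RA team
plus_ten = pythag_wins(785, 775)  # roughly 82.0 wins
print(round(plus_ten - base, 2))  # about 1.04, i.e. ten runs is about one win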

Win Values Explained: Part Six


by Dave Cameron - January 2, 2009

Over the first five parts of this series, we've discussed all the components of what makes up a Win Value. Today, we tackle the conversion of that win value into a dollar value. First off, a little background. Since we've set replacement level at around a .300 win% (or 48 wins per team), that means that there are about 1,000 marginal wins in a major league season. All 30 teams are fighting over these 1,000 wins, each trying to get more than 45 or so to get them into the playoffs. Every dime a major league team spends above the major league minimum is theoretically spent in an effort to buy as many of those 1,000 wins as possible.

A major league team's minimum payroll is about $12 million, so MLB as a whole has a floor of $360 million in salary per season. Total payroll for MLB teams in 2008 was reported at $2.67 billion. That means that major league teams spent $2.31 billion to try to buy their share of those 1,000 marginal wins. Basic division tells us that the cost of a win in MLB salary was $2.31 million per win for 2008. However, a huge share of those wins were created by players whose salaries were not determined by a free market system. Every player with zero to six years of service time had an artificially depressed salary due to not being able to qualify for free agency. As well, most players who signed long term contracts that bought out some of their arbitration and free agent years had salaries below market value; they had traded some potential cash for the security of a deal several years ago. The amount of money that teams are paying per win for their cost controlled players is far less than the $2.31 million league average.

So, the market of wins available for purchase doesn't total 1,000. A significant batch of MLB players simply aren't available for acquisition at any given time. The Cardinals aren't trading Albert Pujols. The Mariners aren't trading Felix Hernandez. The Rays aren't trading Evan Longoria. The wins that these players generate are not for sale. Who is available? Obviously, players who qualify for free agency in a given season are available. Also, there are players traded from one club to another, so those players are also available for the right price. But what is the right price?

In general, we can say that the market price of a win is the mean of the dollars per win handed out to free agents in any given year. If you approached CC Sabathia this winter and offered him $12.65 million because he was a 5.5 win pitcher and the league average cost per win is $2.3 million, you wouldn't have gotten very far. If you want to compete in the market for available wins, you have to know what the going rate for a win is, and the easiest way to calculate that is to look at the free agent market.

Let's look back at 2007, for instance. 90 free agents signed major league contracts last winter, ranging from Alex Rodriguez's $275 million deal to Josh Towers' $400,000 contract with the Rockies. The sum of those 90 contracts paid out $396 million in 2008. To figure out what the average cost per win of a 2007 free agent was, though, we need to know how many wins that group was worth. To calculate this, I did a three year weighted average of their win values, then multiplied that value by .95 to factor in aging and estimate what teams considered a player's true talent win rate for 2008. In total, I came up with 88 wins, or $4.5 million per win. That's what major league teams were paying for a marginal win last winter, so for 2008, that's a player's dollar per win value as listed on the site. I re-did this for all years going back to 2002, and the dollars per win for each are as follows:

2002: $2.6m / win
2003: $2.8m / win
2004: $3.1m / win
2005: $3.4m / win
2006: $3.7m / win
2007: $4.1m / win
2008: $4.5m / win

Now, I know there's some sentiment that teams don't pay for wins linearly, because a six win player is worth more than three two win players. While I agree with this in theory, major league teams just don't operate this way. If you just look at the dollar per win costs for the multi-year contracts handed out to hitters last year, the cost per win was $4.3 million for guys with an average win value of 4.4 wins per player. Alex Rodriguez signed for about $3.8 million per win last year. Teams just don't pay exponentially more for higher win value players than they do for average and below players. You could argue that they should (and I would probably agree), but they don't. The dollar per win scale is linear. This afternoon, we'll look at the opportunities that are presented to teams due to the linear nature of dollar per win, and how the smart teams are exploiting this to their advantage.
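For anyone who wants to replicate the dollars-per-win estimate, here is a rough Python sketch. The $396 million in salaries, the roughly 88 wins, and the .95 aging discount are the figures quoted above; the 5/4/3 weighting of the last three seasons is an assumption for illustration, since the exact weights aren't spelled out.

```python
# A rough sketch of the free-agent dollars-per-win estimate described above.

def market_wins(win_values_last3: list[float], aging_discount: float = 0.95) -> float:
    """Weighted three-year average of a player's win values, discounted for aging.
    The 5/4/3 weighting (most recent season first) is assumed, not from the article."""
    weights = [5, 4, 3]
    wavg = sum(w * wv for w, wv in zip(weights, win_values_last3)) / sum(weights)
    return wavg * aging_discount

def dollars_per_win(total_salaries: float, total_expected_wins: float) -> float:
    return total_salaries / total_expected_wins

if __name__ == "__main__":
    # The 2007-08 free-agent class from the article: $396M in 2008 salary, ~88 wins.
    print(round(dollars_per_win(396_000_000, 88) / 1_000_000, 2))  # ~4.5 ($M per win)
```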

Win Values Explained: Part Seven


by Dave Cameron - January 5, 2009 Before moving on, I wanted to do one more post on the Win Value series we covered last week, emphasizing a few points that may have got lost in the shuffle. While we think these win value stats are a tremendous addition to the site and should be extremely useful, we also want to maintain integrity in how we talk about them and the ways they are used. So, with that said, heres some things to keep in mind. All catchers are assumed to be average defensively. This is obviously not true, but in terms of quantifying catcher defense, were just not there yet. We have a pretty good idea that most major league catchers fall somewhere between -10 runs and +10 runs, based on their ability to block balls in the dirt, control the running game, and so forth. So, as a general guideline, if you think the catcher is awful defensively (maybe Ryan Doumit is a good example), knock one win off. If you think hes just below average (Ramon Hernandez?), knock off half a win. If you think hes above average (Kurt Suzuki?), add half a win. if you think hes outstanding (Joe Mauer?), add a full win. There are a few things not included. The only aspect of baserunning that is currently included is SB/CS. Throwing arms and turning double plays are currently not included in the fielding evaluations. In general, no ones going to be more than +5 or -5 in these minor areas, but for guys at the extremes, it could be half a win or so.

Were measuring past performance, not necessarily true talent level. Just because Jayson Werth put up a +5 win season in 2008 does not mean that were saying he is a +5 win player. It is pretty common for people to play above or below their actual level of abilities. Dont get too wrapped up over a single season performance. The leagues are not necessarily even in talent level in every year. For recent years, theres a good bit of evidence that the AL has been better than the NL. It may even be slightly more accurate to use league specific replacement level adjustments, especially for the 05-07 time period. Well work on trying to quantify the differences in leagues going all the way back to 02 so that we could potentially include the league differences later on. The dollars to win adjustments arent super easy to calculate. Reasonable people can differ on what the market value of a marginal win was in different years. I think my methods work pretty well, but they arent perfect. The margin for error is probably around $500,000 in each direction for recent years. Most importantly, were not claiming decimal point accuracy with these win values. If someone is listed at 4.8 wins, and someone else is listed at 4.3 wins, there could be enough mitigating factors that the lower win value player was actually more valuable. When the differences are less than one win, dont be dogmatic about your conclusions. I generally use whole number win values anyway, and I think were best served saying that were aware of some of the things we havent covered yet, and that theres some wiggle room in the numbers. Make no mistake I think these are the best single value metric for evaluating a player on the internet today. Id use a players Win Value number to describe his total performance before I used anything else. But were not saying theyre perfect or that they cant be improved upon. Well keep working on getting better data, figuring things out, and making them even more accurate in the future. Right now, theyre great. Hopefully, by this time next year, theyre even better.

Win Values Explained: Part Eight


by Dave Cameron - January 7, 2009 I have to say, weve really loved watching how the Win Value stats we introduced a few weeks ago have taken off around the blogosphere. It was definitely a project we were excited about, and Im personally quite pleased that you guys have responded to them as well as you have. I did want to mention a couple of things, though. Yes, win values for pitchers are coming. Were actively working on making sure we have the most accurate formula we can for calculating them, and because of things unique to pitchers, they simply arent as straight forward as hitter win values. Theres the starter/reliever issue, how leverage should be handled, the DH issue between leagues, and the various issues of what is within a pitchers ability to control and what could be considered outside influence. When we introduce the

pitcher Win Values, though, well definitely walk through them step by step, as we did with the hitters, and try to make them as transparent as possible. I know that a lot of you are already calculating these win values on your own as well, using various inputs, especially in terms of projecting how teams will fare in 2009. Thats great, and definitely one of the fun things you can do with a Win Value system. This isnt intended to dampen your enthusiasm for these metrics at all. However (and I know you could feel that word coming), I think there are a few things we should mention in regards to adding up Win Values for a roster. First off, wins arent entirely linear. A player who is projected as a +2 win player wont have the exact same impact on a 60 win roster that he would on a 95 win roster. Theres diminishing returns that start kicking in, and there are only so many at-bats and high leverage innings to go around. And, of course, the specifics of the players skillset interact with his environment, so a change in environment could change his value. For instance, taking a flyball pitcher and sticking them in front of the worst outfield defense in the world is going to have an impact on the value of a +2 win pitcher. By just adding up individual player win values, we lose these contexts, and they matter. Also, as many of you have noticed, the position adjustments dont add up to zero. For each individual player, this isnt a problem. However, the fact that the AL has a DH and the NL doesnt makes it an issue when trying to compare teams across leagues. Its not that hard to adjust for, but it shouldnt be left out of the discussion if you start doing Win Value evaluations or projections for all 30 major league clubs. Just a few things to keep in mind as we all bask in the awesomeness of the availability of these metrics.

Pitcher Win Values Explained: Part One


by Dave Cameron - January 12, 2009 Since we released the Win Values for hitters here on the site, one of the main questions was when we were going to add them for pitchers. The answer: today. If you go to a pitchers player page here on FanGraphs, youll see the newly added Value section down at the bottom. For example, heres Johan Santanas Win Values for the past five years: 2004: +7.6 wins 2005: +7.2 wins 2006: +7.1 wins 2007: +3.4 wins 2008: +4.6 wins During his stretch of dominance with the Twins, he was consistently amazing. He took a step back in his last year in Minnesota though, and while he rebounded somewhat this year, he hasnt been the same elite guy that he was in his prime during the last two years. Still very

good, certainly, as a +4.6 win pitcher is among the best in the game, but not quite the guy he was from 2004 to 2006. Other fun pitchers to look at: Brad Lidge (+3 wins from a closer in '08, quite the addition for Philly), Barry Zito (the Giants should have seen this coming), and Ben Sheets (seriously, someone should give this guy some money). So, now, for the obvious question: how on earth did we come up with these things? Starting tomorrow, we'll do a week-long series explaining the calculations behind pitcher win values and the questions that arose during the process. They're far more complicated than hitter win values, honestly; there are issues of run environments, leverage, and context that had to be accounted for, and in many cases, the decisions of how to handle these things aren't cut and dried. So, over the next few days, we'll dig into those issues and talk about how we arrived at the values we did, and what the positives and negatives of those decisions are.

Pitcher Win Values Explained: Part Two


by Dave Cameron - January 13, 2009 As we announced yesterday, win values for pitchers are now available on the site. As before, were going to go through the process of explaining the calculations that lead to the values you see here on FanGraphs and lay the foundation for understanding what these win values represent. To start with, lets take a look at the main input that goes into the win value calculation a pitchers FIP, or Fielding Independent Pitching, which calculates a pitchers responsibility for the runs he allows based on his walks, strikeouts, and home runs allowed. The FIP formula is (HR*13+(BB+HBP-IBB)*3-K*2)/IP, plus a league-specific factor that scales FIP to match league average ERA for a given season and league. For the win value purposes, we modified the league specific factor to scale FIP to RA instead of ERA. Why did we use FIP? I know this a popular question, and its something I wrestled with myself. However, what I couldnt get away from is that we wanted the context sensitivity for the position player and pitcher win values to be as close as possible. wRAA, the offensive input into Win Values for position players, is context-neutral a hitter does not get credit for his situational performance, such as hitting well with runners in scoring position. Since we arent giving hitters credit for situational performance, we cant give it to pitchers either, in order to maintain the same situation neutral scale. This is going to lead to some questions were aware of that. Claiming that Javier Vazquez was a +5.2 win pitcher in 2006, when traditional metrics will tell you that he went 11-12 with a 4.84 ERA, is going to be a tough sell. We know. However, the tangled web of responsibility for run prevention is not accurately unraveled by simply giving pitchers credit and blame for all earned runs and fielders credit and blame for all unearned runs. As most of you know, there are so many extra variables that go into a pitchers ERA that the pitcher himself simply doesnt have control over. We have to try to extract the pitchers responsibility from his teams run prevention while hes on the mound.

Using ERA or RA simply adds too many non-pitcher factors into the equation, to the point that we're no longer just evaluating the pitcher. FIP removes defense from the equation by only looking at three factors that a pitcher has demonstrable control over: walks, strikeouts, and home runs allowed. By using FIP, we're isolating the pitcher's core abilities and evaluating him based on those skills. Now, we're not claiming that FIP captures everything a pitcher is responsible for. It is not the perfect context-neutral pitcher run modeler; we know that. But when confronted with a choice of including way too many non-pitcher inputs or leaving out a few minor actual pitcher inputs, the latter was the better choice. You will get more accurate win values for a pitcher using FIP than you will with ERA or RA.

Getting back to Vazquez for a second: his 2006 FIP was a full run lower than his ERA. The driving forces behind his struggles were a .321 BABIP and a 65.8% LOB%. Most everyone would agree that we don't want to penalize him for poor defense played behind him, but how do we untangle the responsibility for the lack of stranded runners? Vazquez was horrible with men on base in '06, but most of that was BABIP related: a .343 BABIP with men on versus a .284 BABIP with the bases empty. If we're going to say that he's not responsible for his high batting average on balls in play, and the batting average on balls in play was responsible for the lack of runners stranded, then how do we remove the former but not the latter? This is what I mean by a tangled web of responsibility in terms of run prevention.

If you wanted to make the argument that the context-sensitive stuff, such as how often a pitcher leaves runners on base, should be included, then you also need to be prepared to fight for WPA/LI as the offensive metric of choice for hitter win values. And honestly, I won't put up much of an argument; there's a case to be made for context-sensitive win values as a useful metric, and I'd imagine there will be a day that those are publicly available too. But there's a more compelling argument for context neutral win values, which is what we've decided to present here. What most of us are interested in knowing is how well a player performed in helping his team win, regardless of the performance of his teammates. To answer that, we have to strip out as much context as we can.

Think of FIP as the pitcher version of wRAA. wRAA doesn't include non-SB/CS baserunning or situational hitting. FIP doesn't include batted ball data or situational pitching. Neither is perfect, but both give us the vast majority of the context-neutral picture. That doesn't mean that we're set in our ways and that these win values will never be improved upon. If and when a new metric like tRA is proven to be significantly more effective in valuing pitchers (and I'm hopeful that it will be, given more data exploration on the topic), we won't be standing here as guardians of the infallibility of FIP. We want to get to the truth, and do so as quickly and as accurately as possible. I will encourage you (especially those of you in the "tRA is awesome/FIP sucks" camp), though, to not let minor differences cause you to miss the fact that FIP and tRA lead to very similar results. This afternoon, we'll talk about replacement level for pitchers, how it differs for each league and role, and how we tackled the issue.
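For readers who like to see the calculation itself, here is a minimal Python sketch of the FIP formula given above. The league constant is whatever value anchors league-average FIP to league-average ERA (or RA, for the win-value version); the 3.10 below is just a placeholder, and the stat line in the example is made up.

```python
# A minimal sketch of Fielding Independent Pitching as described above.

def fip(hr: int, bb: int, hbp: int, ibb: int, k: int, ip: float,
        league_constant: float = 3.10) -> float:
    """FIP: home runs, unintentional walks plus HBP, and strikeouts per inning,
    plus a league constant (placeholder value here) that sets the scale."""
    return (13 * hr + 3 * (bb + hbp - ibb) - 2 * k) / ip + league_constant

if __name__ == "__main__":
    # Hypothetical pitcher line: 20 HR, 50 BB, 5 HBP, 3 IBB, 180 K in 200 IP.
    print(round(fip(hr=20, bb=50, hbp=5, ibb=3, k=180, ip=200.0), 2))  # 3.38 with this constant
```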

Pitcher Win Values Explained: Part Three


by Dave Cameron - January 13, 2009

This afternoon, we talked about why we chose FIP as the metric to base our pitcher win values on. Now, we turn our attention to the other key aspect involved with understanding how many wins a pitcher was worth: the value of a replacement level pitcher as the baseline. As we did with position players, we're defining replacement level production as the expected performance you could get from players who can be acquired for virtually no cost. This pool of players would include free agents who sign minor league contracts or for the league minimum, Rule 5 draft picks, guys claimed on waivers, and minor league veterans who can't shake the Quad-A label. Some walking examples of that group from this off-season would include R.A. Dickey, Clay Hensley, Jason Johnson, Gary Majewski, and Tomo Ohka. Not the most impressive group of guys ever, but that's why they've signed for nothing. They represent a portion of the free talent community, and that's the group that we want to define as zero value pitchers. Here's a good discussion about the historical quality of replacement level pitchers.

As Tom notes, it is extremely important to distinguish between roles. While both involve hurling a ball towards home plate, starting and relieving are still remarkably different. Relievers are, in general, failed starting pitchers who are given an easier task that their skillset allows them to handle. They are selectively managed to face hitters whom they have the best chance of getting out, and they get to throw at maximum effort on nearly every pitch, giving them greater velocity over their shorter appearances. Nearly every starting pitcher in baseball could be a useful relief pitcher. Very few relief pitchers could be useful starting pitchers. The distribution of pitching talent is skewed very heavily towards the rotation, and because of this and the extra skills required to pitch 5+ innings per start, we use different replacement levels for starting and relieving in order to capture the additional value added by starting pitchers above and beyond simple run prevention.

What are those replacement levels? Perhaps it's easiest to understand them in relation to a single game. If we assume that a team has a league average offense, a league average defense, and a league average bullpen, and that they are playing a league average opponent, we would expect them to win any single game started by a replacement level starter 38% of the time. At the same time, we'd expect a team with a league average offense, league average defense, and a league average starting pitcher, facing a league average opponent, to win 47% of any games in which their replacement level bullpen was used. So, we call a .380 win% the replacement level line for a starting pitcher, and a .470 win% the replacement level line for a relief pitcher. However, league average is different for the AL and NL, thanks to the DH, and of course offensive levels vary from year to year. So, how do we take these replacement level winning percentages and compare them to the RA-scaled FIPs we talked about earlier? We'll work through those calculations tomorrow.

Pitcher Win Values Explained: Part Four

by Dave Cameron - January 14, 2009 As we talked about last night, pitcher replacement level is set at a .380 win% for starters and a .470 win% for relievers. However, because of the differences between the AL and the NL, as well as varying offensive levels over the years, there isn't a fixed mark that we can point to as a replacement level FIP that works for each year and each role. But since we've got the .380/.470 marks, we can derive those numbers with just a little bit of work.

Let's walk through the process, first using a 2008 American League starting pitcher as our example. The league average runs per game in the AL last year was 4.78. The FIPs that are displayed on the pitcher's player card here at FanGraphs are scaled to ERA, but for the win values, we modified the formula slightly to scale it to match league RA. However, there's a shortcut if you want to take a pitcher's traditional FIP and have it match up with the league RA: divide his FIP by .92. For instance, a 4.40 FIP divided by .92 will give you a 4.78 FIP. That .92 is the ERA-RA bridge, and it allows us to conclude that 4.40 would be a league average FIP in the American League last year. So, a pitcher with a 4.40 FIP in a neutral park would be a league average pitcher. Or, put back into win% terms, a .500 pitcher.

Now, because we've set replacement level at .470 for relievers and .380 for starters, we know that a replacement level FIP for an AL reliever will be a lot closer to 4.40 than it will be for a starter. How much closer? Running the numbers through the formula gives us a 4.68 FIP (traditional, not scaled to RA) for an AL reliever and 5.63 for an AL starter. So, if you're looking at a pitcher's FIP here on his FanGraphs page, and that pitcher happens to be in the American League, those are the numbers you'd want to compare him to in order to see how far away from replacement level he is. For the NL in 2008, the numbers are 4.45 for a reliever and 5.37 for a starter; the lack of a DH drives down the league's offensive level, and so the performance of a replacement level pitcher will appear better in the NL than in the AL.

Remember, these are park neutral numbers, so if you're looking at a player who pitched in a park that significantly affects offense, you'll have to adjust his FIP to account for the park effects. If the NL starter that we were looking at pitched in a park that suppressed offense by 5%, then the replacement level for that park would be 5.10, not 5.37. Thus, you'd want to use the lower replacement level for his home innings, and the league average replacement level for his road innings. Assuming equal distribution, that would make the replacement level FIP 5.23 for that NL starter pitching in a park with a park factor of .95. As you can see, the run environment that the pitcher exists in has a substantial effect on the replacement level value. But the impact of run environments doesn't stop there, and it is further complicated by the fact that starting pitchers have a significant impact on their own run environment. The expected offensive level is a lot lower in a game where Johan Santana is pitching than where Cha Seung Baek is pitching. In order to calculate the runs to wins conversion for each pitcher, we have to take into account that a pitcher impacts his own run environment. We'll talk more about this later this afternoon.
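One way to back out the 5.63 and 4.68 figures quoted above is to solve for the runs allowed rate that produces a .380 (or .470) win% under the dynamic runs-to-wins formula introduced in the next part of the series, then rescale with the .92 ERA-RA bridge. That is a reconstruction rather than FanGraphs' own code, but it reproduces the published numbers; here is a short Python sketch.

```python
# Solve 0.500 + (lgRA - RA9) / RPW(RA9) = target win% for RA9, where
# RPW(RA9) = 1.5 * ((lgRA + RA9) / 2 + 2), then multiply by .92 to land on
# the ERA-scaled FIP numbers shown on player pages.

def replacement_fip(league_ra: float, target_win_pct: float) -> float:
    d = target_win_pct - 0.500
    # The equation above is linear in RA9; solving it directly:
    ra9 = (league_ra * (1 - 0.75 * d) - 3 * d) / (1 + 0.75 * d)
    return ra9 * 0.92

if __name__ == "__main__":
    print(round(replacement_fip(4.78, 0.380), 2))  # 5.63 -- 2008 AL starter
    print(round(replacement_fip(4.78, 0.470), 2))  # 4.68 -- 2008 AL reliever
```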

Pitcher Win Values Explained: Part Five


by Dave Cameron - January 14, 2009 We ended the last post in this series by talking about run environments. Generally, when you hear someone talk about a run environment, you think of either a specific park that has a notable influence on run scoring (say, Coors Field) or an era of baseball where the offensive level was significantly different than it is today (the dead ball era, for instance). In these environments, the game is a bit different, and runs can be either much more valuable or less valuable in helping win a game, depending on the context of the environment. However, it's not just in extreme parks or long ago where the run environment varies from the modern norms. Indeed, the run environment of current baseball varies from day to day depending on which pitcher is on the mound.

CC Sabathia, through his dominance with the Indians and Brewers last year, created his own traveling low-run environment. When he took the mound, runs became hard to come by. Through his own abilities, Sabathia created a run environment in his own starts that wasn't that close to the league average run environment of 2008. This presents an issue. If we were to use the standard league average runs to wins conversion based on a normal run environment, we'd run into problems. By virtue of creating his own run environment, Sabathia has changed the context of the value of runs in relation to wins. All pitchers do this to an extent, and the further away from league average they are, the more they influence their particular environment. So, if we know that pitchers are changing the relationship between runs and wins in their starts, then we need a dynamic runs to wins estimator that adjusts for their individual run environment.

This is where the always awesome Tom Tango comes to the rescue, as usual. In talking with him about this, he suggested that we use the formula (((League RA + Pitcher's RA)/2) + 2) * 1.5, which handles the runs to wins conversion in differing environments. Essentially, that formula averages the pitcher's FIP scaled to RA with the league average RA, then adds in the constants to create the runs to wins conversion for a given environment. If we looked at an average pitcher in the AL, for instance, the formula would give us (4.78 + 4.78)/2, which is of course 4.78. (4.78 + 2)*1.5 gives us 10.17, which is the average runs to wins conversion for the AL in 2008. Now, if we had a pitcher whose park adjusted FIP scaled to RA was 3.00, then the environment would be 3.89 RA, and the runs to wins conversion would be 8.84. As you can see, an excellent pitcher significantly lowers the amount of runs it takes to equal a win. Doing the basic "divide runs by 10" thing shortchanges good pitchers and overvalues bad pitchers.

So, we've built this dynamic runs to wins conversion tool into the win values you see on the page. I know, this is a level of detail that most of you won't care about and can understandably be tough to wrap your head around, but as we said, we want these win values to be transparent, and this calculation is required if you're trying to re-engineer the values on the site.
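In code, the conversion is a one-liner; here is a small Python sketch reproducing the two examples above.

```python
# The dynamic runs-to-wins conversion described above, using 2008 AL numbers.

def runs_per_win(league_ra: float, pitcher_ra: float) -> float:
    """Average the pitcher's RA-scaled FIP with league RA, add 2, multiply by 1.5."""
    return ((league_ra + pitcher_ra) / 2 + 2) * 1.5

if __name__ == "__main__":
    print(round(runs_per_win(4.78, 4.78), 2))  # 10.17 -- a league-average AL pitcher
    print(round(runs_per_win(4.78, 3.00), 2))  # about 8.84 -- an excellent pitcher
```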

Pitcher Win Values Explained: Part Six

by Dave Cameron - January 16, 2009 We took a day off from the pitcher win value explanations yesterday so I could help a friend move (when you need to move on a weekday, call the baseball writing friend with the flexible schedule; he's always available), but we'll tackle park factors this afternoon and wrap up the series on Monday and Tuesday of next week. As mentioned earlier, the win values are based on a park adjusted FIP. However, we never covered how we handled the park factor. There are lots of different park factors floating around out there, so I figured it would be useful for us to spend a bit of time talking about them.

For those that aren't aware, a park factor is basically the run environment of a particular ballpark expressed as a decimal, where 1.00 is average. A ballpark with a park factor of 0.90 would depress run scoring by 10%, so that if the league average runs per game is 5.00, then the runs per game in that park would be 4.5. On the flip side, a ballpark with a park factor of 1.10 would have an average of 5.5 runs per game. Park factors are determined by the relative offensive level between each park and the league average. One of the common misperceptions about park factors is that they will be overly influenced by the home team. However, because the home team plays equal amounts of games per season in their home park and on the road, and the visiting teams also play 81 games per year in that park, we get a decent sized sample with which to understand how parks affect run scoring.

That doesn't mean that there isn't noise in a single year's park factor, however. Let's take Turner Field in Atlanta as an example. Here are the single year park factors for that park since 2002:

2002: .88
2003: 1.04
2004: .94
2005: 1.01
2006: 1.02
2007: .95
2008: 1.01

That works out to an average of .98 across those seasons, which makes it just barely below average in terms of runs per game, but it obviously hasn't been very consistent from year to year. The 2002 to 2003 change, especially, would suggest that the park went from being something like Petco Park to being more like Fenway Park. Most parks don't have swings that large, but single year park factors can still be a bit unreliable. So, to calculate the win values, we've used a five year regressed park factor. For 2008, here are the park factors we used for all thirty teams:
Season: 2008 (five-year regressed park factors)

Team                      PF
Arizona Diamondbacks      1.00
Atlanta Braves            1.01
Baltimore Orioles         1.03
Boston Red Sox            1.04
Chicago Cubs              1.04
Chicago White Sox         1.02
Cincinnati Reds           0.99
Cleveland Indians         1.00
Colorado Rockies          1.09
Detroit Tigers            1.05
Florida Marlins           0.97
Houston Astros            0.99
Kansas City Royals        1.00
Los Angeles Angels        0.99
Los Angeles Dodgers       0.98
Milwaukee Brewers         1.00
Minnesota Twins           0.98
New York Mets             0.97
New York Yankees          1.00
Oakland Athletics         0.98
Philadelphia Phillies     1.02
Pittsburgh Pirates        0.98
San Diego Padres          0.92
San Francisco Giants      1.01
Seattle Mariners          0.96
St. Louis Cardinals       0.98
Tampa Bay Rays            0.98
Texas Rangers             1.04
Toronto Blue Jays         1.01
Washington Nationals      1.01

Pitcher Win Values Explained: Part Seven


by Dave Cameron - January 20, 2009 In talking about how we calculate pitcher win values, we've covered FIP, differences in replacement level for each league and role, run environments, the dynamic runs-to-wins conversion, and park factors. What we haven't done is walked through an example, from scratch, of how the pitcher win values are calculated. That's what we're going to do today. We'll use Felix Hernandez as our guinea pig.

In 2008, he threw 200 2/3 innings with a 3.80 FIP as a starting pitcher in the American League. Remember, we noted earlier that the league average runs per game in the AL was 4.78 last year, so we rescaled Felix's FIP to make 4.78 equal league average. Adding in a park adjustment for a half season in Safeco Field (with a park factor of .96), we get a 4.28 neutral park FIP scaled to RA for Felix's 2008 season.

Now, we have to figure out the runs to wins conversion based on Felix affecting the run environment he pitches in. To do so, find his innings pitched per start (6.5), subtract that from 18, and multiply the result by the league average runs per game. Then, we add to that those 6.5 innings multiplied by his park adjusted FIP, divide the whole thing by 18, and then use Tango's +2, *1.5 runs-to-wins converter. So, the formula for Felix would be (((11.5 * 4.78 + 6.5 * 4.28)/18)+2)*1.5, which gives us a run environment of 9.90 runs per win. So, for every 9.9 runs he saves, he gets credit for one win.

His 4.28 FIP is 0.50 runs per nine innings better than league average. What does Felix's 4.28 FIP translate into in terms of win%? 0.50 divided by 9.90 equals .050. Add that to .500 and we get .550, making Felix a .550 win% pitcher. Remember, an average pitcher would post a .500 win%, and a replacement level starting pitcher would post a .380 win%. So now we subtract .380 from .550, and we get Felix as .17 wins better than a replacement level starter every nine innings. Factoring in his actual innings pitched, we get .170 * 200.67 / 9, which comes out to 3.8 wins. That's his wins above a replacement level starting pitcher, or what we call his win value for 2008. Remember, though, these are context neutral win values. Actual wins contributed to a

teams ledger will also be affected by how each pitcher performed with runners on base, as well as the performance of the defenders behind the pitcher. There are going to be cases where a pitcher has a much better (or worse) context neutral win value than you might expect if youre used to looking at his W-L record or his ERA. That does not mean these win values are wrong. Weve removed the situational context of the pitchers performance, just as we do for hitters. Pitchers can either underperform or outperform their win values with extreme performances in clutch situations. We can measure the differences in these situational performances by looking at a pitchers WPA or WPA/LI and comparing it to his Win Value. For too long, weve lacked a resource for context neutral win values for pitchers, having to settle for situational win values that include a lot of variables. These pitcher win values offer us a great opportunity to explore more of what is in a pitchers control and what is not.
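For anyone who wants to follow along in code, here is a minimal Python sketch that reproduces the Felix Hernandez walk-through above. The inputs (4.28 park-adjusted RA-scaled FIP, 6.5 IP per start, 4.78 league RA, .380 replacement win%) are the ones quoted in the article.

```python
# Reproducing the starter win-value example from this post.

LEAGUE_RA_AL_2008 = 4.78
REPLACEMENT_WPCT_STARTER = 0.380

def starter_run_environment(pitcher_ra: float, ip_per_start: float,
                            league_ra: float = LEAGUE_RA_AL_2008) -> float:
    """Blend the pitcher's own RA with the league rate for the rest of the game,
    then apply the +2, *1.5 runs-to-wins conversion."""
    blended = ((18 - ip_per_start) * league_ra + ip_per_start * pitcher_ra) / 18
    return (blended + 2) * 1.5

def starter_war(pitcher_ra: float, ip: float, ip_per_start: float,
                league_ra: float = LEAGUE_RA_AL_2008) -> float:
    rpw = starter_run_environment(pitcher_ra, ip_per_start, league_ra)
    win_pct = 0.500 + (league_ra - pitcher_ra) / rpw
    wins_above_repl_per_9 = win_pct - REPLACEMENT_WPCT_STARTER
    return wins_above_repl_per_9 * ip / 9

if __name__ == "__main__":
    print(round(starter_run_environment(4.28, 6.5), 2))   # ~9.9 runs per win
    print(round(starter_war(4.28, 200 + 2 / 3, 6.5), 1))  # ~3.8 wins
```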

Introducing Fielding Dependent Pitching


by Dave Cameron - August 29, 2012 A few minutes ago, David Appelman announced the launch of several new stats here on the site, and since they hit on a topic of frequent discussion, I wanted to go into a bit more depth on our thought process behind their creation and what we see as their role in the evaluation of pitching.

Over the years, FanGraphs hasn't been shy about promoting the concept of DIPS, which showed that most of the variance in a pitcher's abilities can be viewed through the prism of walks, strikeouts, and home runs. We often cite a pitcher's FIP (Fielding Independent Pitching, if you're into proper names) when talking about his performance, and for most Major League pitchers, FIP works really well as an evaluator of their contribution to run prevention. However, because FIP only focuses on walks, strikeouts, and home runs, it does not include all aspects of run prevention. Specifically, it takes no stance on two aspects of the game that do have a significant impact on a pitcher's total number of runs allowed: the results of batted balls that are not home runs and the effects of sequencing of the various events. Because the spread in talent among Major League pitchers is not as large in these areas as the spread is in the components of FIP, ignoring these two areas doesn't have a drastic result on the evaluation of most pitchers. However, there is certainly a subset of Major League pitchers who do accumulate (or fritter away) value through their performance in these two categories.

So, today, we're introducing a set of metrics designed to help quantify the effects on run prevention that are not so easily isolated as the result of a pitcher's actions. Because these metrics essentially serve to capture the value that FIP does not, we're calling the sum of these metrics Fielding Dependent Pitching. The idea for FDP was to quantify the remaining aspects of run prevention that are not measured by walks, strikeouts, and home runs. With a FIP-based WAR, we have a metric that tells us how many wins a pitcher added through success in those three key areas. What we did not have was a metric that gave us the wins added through either hit prevention or runner stranding. With FDP, we wanted to be able to break down the remaining aspects into those

two categories, so that we could identify exactly where a teams run prevention with a specific pitcher on the mound was coming from. To do that, we simply worked backwards. First, we calculated the total WAR that a pitcher would receive credit for if he was only evaluated by his runs allowed, and we assumed that he had 100 percent responsibility for every variable that influenced run scoring. That stat is now on the site, and is called RA9-Wins. If you do not want to consider any impact of fielding on run prevention, and solely want to evaluate a pitcher by what actually happened when he was pitching (accounting for park and league adjustments, at least), then this is the metric for you, However, for those of you who want to look at a pitchers contribution to run prevention in a more detailed way, we are also adding the two components of FDP to give you a better view of just how a pitcher is going about preventing runs. To do this, we decided to calculate the linear weight value of singles and doubles the difference between a double and a triple is almost certainly not due to something the pitcher did, and thus the value of advancing that extra base was not included the same way that we calculate the value of walks, strikeouts, and home runs, so that we could quantify the wins added or lost that can be credited to a pitchers results on balls in play. Regardless of how you want to apportion the credit for those results, it is helpful to know what the value of those turning those hits into outs (or vice versa) actually is. Just as with WAR, these numbers are park adjusted and then converted into a number of wins added. These are on the site as BIPwins, and can be thought of as the amount of wins a pitcher saved through his hit prevention, or lack thereof in many cases. Once we knew the win value of a pitchers hit prevention, the remainder of his FDP could essentially be described as runner stranding. Now, this is not one particular skill, as there are many ways to skin this cat, but represents the value added through various skills that all essentially lead to the same result not allowing baserunners to cross home plate. Some pitchers achieve this through effective control of the running game, picking off runners and refusing to let them advance through stolen bases. Other pitchers do this by simply altering the way they pitch with men on base, increasing the amount of only-semi-harmful walks they allow in order to reduce the amount of very-harmful home runs they allow. And still others simply seem to excel (or fail) at pitching out of the stretch relative to their peers, and demonstrating significant differences in their performance with the bases empty and with men on base. No matter how they get there, however, the result can be measured by taking the remainder of a pitchers FDP that is not measured by his context-neutral hit prevention. This is called LOBwins on the site, and serves as the value of wins added through all the miscellaneous ways a pitcher can strand runners. BIP-wins and LOB-wins can be thought of as the components of Fielding Dependent Pitching, and represent the part of keeping runs off the board that arent measured by FIP. By definition, the sum of a pitchers FIP-wins and FDP-wins will equal his RA9-wins, so you can essentially see total run prevention through this basic formula: RA9 = FIP + BIP + LOB
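Since the relationship is just an accounting identity, a tiny Python sketch may help make it concrete. The numbers in the example are made up for illustration; they are not any particular pitcher's totals.

```python
# FDP is the gap between runs-allowed-based wins and FIP-based wins, and it
# splits into a balls-in-play piece and a runner-stranding piece.

def fdp_wins(ra9_wins: float, fip_wins: float) -> float:
    """Fielding Dependent Pitching wins: everything RA9-wins credits that FIP does not."""
    return ra9_wins - fip_wins

def lob_wins(ra9_wins: float, fip_wins: float, bip_wins: float) -> float:
    """The stranding component is whatever part of FDP isn't explained by balls in play."""
    return fdp_wins(ra9_wins, fip_wins) - bip_wins

if __name__ == "__main__":
    # Hypothetical season: 6.0 RA9-wins, 4.5 FIP-wins, +1.0 wins from hit prevention.
    print(fdp_wins(6.0, 4.5))       # 1.5 FDP-wins
    print(lob_wins(6.0, 4.5, 1.0))  # 0.5 LOB-wins
```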

So, that's the somewhat boring explanation part of the introduction. Now, let's get to the fun stuff and actually play with the data. In the last 10 years, here are the top five and bottom five pitchers in total FDP:

Johan Santana: +11.4 wins
Tim Hudson: +10.9 wins
Ryan Franklin: +10.4 wins
Matt Cain: +9.2 wins
Jered Weaver: +9.2 wins

Derek Lowe: -10.0 wins
Mark Hendrickson: -8.5 wins
Ricky Nolasco: -8.3 wins
Jeremy Bonderman: -8.2 wins
Sidney Ponson: -7.1 wins

These names are probably familiar to you if you've had any kind of discussion about the validity of FIP in the last few years. The four big names in the top five are the most often cited as pitchers whom FIP underrates, and FDP shows just how large the gap is between their FIP-wins and their RA9-wins. Meanwhile, the guys on the bottom of the list are notorious underachievers, each of whom has been derided for failing to live up to their expected potential. As you can see, FDP returns the results you might expect if you were to look at the biggest FIP outliers of the last decade. However, this is also an example of why breaking FDP down into BIP-wins and LOB-wins is useful, as we can present this same list, just showing where those wins added or lost came from.

Pitcher            BIP-wins   LOB-wins
Johan Santana      10.7        0.7
Tim Hudson          8.3        2.6
Ryan Franklin       5.0        5.4
Matt Cain          11.1       (1.9)
Jered Weaver        7.5        1.6

Santana's and Cain's extra value has come entirely through hit prevention. Santana stranded just about as many runners as you'd expect from a low-FIP/low-BABIP pitcher, while Cain has actually stranded fewer runners than you'd expect based on his context-neutral stats. Hudson and Weaver both accumulated value in both areas, but got the majority of their value through hit prevention, while Franklin actually got more of his value through runner stranding, though his RA9 is also significantly impacted by hit prevention. And now for the laggards:

Pitcher            BIP-wins   LOB-wins
Derek Lowe         (4.9)      (5.2)
Mark Hendrickson   (4.6)      (3.9)
Ricky Nolasco      (4.1)      (4.2)
Jeremy Bonderman   (3.1)      (5.0)
Sidney Ponson      (6.9)      (0.2)

Here, we see a more even split, with all five pitchers being negative in both areas. However, my suspicion is that being bad at both hit prevention and runner stranding is necessary to show up on an FDP leaderboard, because pitchers who are truly awful at one or the other are likely weeded out before they ever make the Major Leagues, or at least before they spend significant time pitching for a big league club. That's why the tails are higher at the positive end of the spectrum: those pitchers' success keeps them in the big leagues longer and gives them more opportunities, while those who fail spectacularly at one of the two aspects of FDP simply don't last long enough to show up on a list of most value lost over a ten year period.

Things get more fun if we look at even larger periods of time, however. If we expand the filters to cover the last 50 years, we find examples of guys where FDP tells quite an interesting story. For instance, Jim Palmer, with his career 2.86 ERA and 3.50 FIP, accumulated an incredible +27.8 wins through hit prevention and +15.5 wins through runner stranding. His +43.2 FDP-wins are, by far, the most of any pitcher in the modern era. Perhaps the most recent example of a similar type of pitcher is Tom Glavine, and he's at +26.1 FDP-wins. Palmer is the biggest FIP outlier in the part of baseball history that at least resembles the game that is played today. However, he's not the leader in either BIP-wins or LOB-wins over the last 50 years. The pitcher who got the most value from hit prevention? Charlie Hough, which shouldn't be surprising given the research that has been done on knuckleballers as the strongest exception to the DIPS theory.

Perhaps most interesting, however, is the career of Nolan Ryan, who demonstrates how the two aspects of FDP don't really go hand-in-hand in many cases. Ryan posted a career FIP- of 84 and a .265 BABIP, which should have resulted in dominance that made him among the best run preventers in baseball. It didn't, though, because Nolan Ryan was atrocious at stranding runners (relative to his own established levels, anyway), posting -30.2 LOB-wins over his career. Certainly, that is inflated to some degree simply through longevity, but there is no mistaking the fact that Ryan consistently posted higher ERAs than FIPs, even though his BABIP was also below average.

While Ryan is strange in the magnitude of his inability to prevent runners from scoring, he is representative of the lack of correlation between the two components of FDP. The correlation of BIP-wins and LOB-wins (scaled to IP, so as to account for differences in innings pitched) for all pitchers with at least 50 IP in 2012 is -0.008. In other words, there is no correlation. Of the top 10 pitchers in BIP-wins this year, only two (Jason Vargas and Kyle Lohse) also have positive LOB-wins. Clayton Kershaw is essentially acting as the current-day Nolan Ryan, and has been so ineffective stranding runners that it nearly cancels out all the value added through hit prevention, so that his FIP and ERA nearly match despite the fact that he has a .256 BABIP. This lack of correlation holds up over longer periods of time, too, so this isn't a sample size issue. The components of FDP are measuring two things that are quite different, and very few pitchers stray from the norm in both. In fact, those longer periods of time actually show just how effective FIP is as a measurement of pitching skill.
In looking at all 3,951 pitchers who have thrown at least 100 innings in the majors since 1963, the correlation between the FIP-

based WAR and RA9-wins is .96. For most pitchers with long careers, a WAR based on FIP and a WAR based on runs allowed is going to bring you to the same conclusion. However, most pitchers is not all pitchers, and for pitchers like Jim Palmer (or, nowadays, Jered Weaver, Tim Hudson, and Matt Cain), FDP helps us put a number on the mental adjustment we've been making to help compensate for the fact that FIP does not measure a part of run prevention that they have contributed to in a meaningful way.

Through adding FDP-wins (and its components, BIP-wins and LOB-wins) and RA9-wins to the site, we hope that we're now presenting a more comprehensive picture of how runs are saved when a pitcher is on the mound. It is still definitively true that runs are mostly saved by limiting walks and home runs and keeping batters from making contact, but FDP fills in the gap between FIP and runs allowed, and gives us a clearer picture of the impact of various performances in the things that aren't captured in FIP.

In a subsequent post (this one is already over 2,100 words, straining the limits of the word introduction) that will be up in a few hours, we'll discuss the role that FDP will play in our pitcher WAR and how we hope the addition of these new metrics will help us better reflect the value that different types of pitchers produce. I'm also hosting my regularly scheduled noon chat today, and will make it FDP-centric, answering as many questions about these stats as you guys are interested in asking. For now, though, we hope you enjoy the new tools that are now available, and enjoy perusing the leaderboards and learning new and interesting things that you may not have known before, like how Nolan Ryan was a really good pitcher, but could have been so much better had he performed well with men on base.
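For readers who want to poke at the correlation claims themselves, here is a small, hypothetical Python sketch of the check described above. The column names and the demo data are made up; the idea is simply to scale both components to innings pitched and correlate them across qualified pitchers.

```python
# A sketch of the BIP-wins vs. LOB-wins correlation check, assuming a table of
# pitcher seasons with hypothetical column names (not FanGraphs field names).
import pandas as pd

def component_correlation(df: pd.DataFrame, min_ip: float = 50) -> float:
    """Correlate per-inning BIP-wins and LOB-wins for pitchers above an IP cutoff."""
    qualified = df[df["ip"] >= min_ip].copy()
    qualified["bip_per_ip"] = qualified["bip_wins"] / qualified["ip"]
    qualified["lob_per_ip"] = qualified["lob_wins"] / qualified["ip"]
    return qualified["bip_per_ip"].corr(qualified["lob_per_ip"])

if __name__ == "__main__":
    demo = pd.DataFrame({
        "ip":       [210.0, 180.3, 95.7, 62.0],
        "bip_wins": [1.1, -0.4, 0.6, -0.2],
        "lob_wins": [-0.3, 0.5, -0.1, 0.4],
    })
    print(round(component_correlation(demo), 3))
```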

WAR for Pitchers


We believe in transparency here at FanGraphs, which is why we've gone to some extreme lengths here in the Library to provide our readers with the tools necessary to personally calculate almost every single statistic available at FanGraphs. And that transparency also applies to the site's hallmark statistic: Wins Above Replacement (WAR). If you would like to calculate WAR values for position players, you can find the necessary details over at this Library page. If you're interested in the steps behind calculating WAR values for pitchers, read on:

While it was relatively simple to come up with the offensive statistic to base WAR around for position players (wRAA is a well accepted, context-neutral stat for measuring offensive value), it was more difficult to settle on how to account for pitching win values. Do you give a pitcher credit for the defense behind them? Do you focus on their runs allowed, or do you strip that away and focus instead on how well they pitched? Or, in other words, how much noise do you leave in, and how much do you strip down to signal?

In order to match up with the theory behind WAR for position players, pitching WAR needed to be a context-neutral measure of a pitcher's value to their team. With this parameter in mind, Fielding Independent Pitching (FIP) was a natural fit. FIP strips away the influences of team defense, focusing solely on the variables that a pitcher has control over. FIP also involves considerably less regression than other ERA estimators like SIERA and xFIP, making it a better measure of value added. While SIERA and xFIP estimate a player's hypothetical home run and BABIP rates based on different criteria, FIP uses a player's actual home run rate in its calculations. These factors make FIP a good middle-ground option. It strips away the impacts of defense and measures a pitcher's skill, but it doesn't merely regress away abnormal results. If a pitcher should have allowed 20 home runs (based on his regressed home run rate) but actually gave up 30 home runs, he was a less valuable pitcher to his team than a stat like SIERA or xFIP would have you believe. Those two stats are better at predicting the future, but FIP is better at capturing past value.

The next step, turning a pitcher's FIP into a win total, is no easy thing. Brief explanations of the steps are below, but you can find more detailed information in Dave Cameron's WAR series (linked at the bottom of the page).

Replacement Level. While league average changes on a year-to-year basis, replacement level stays the same: a .380 win% is the replacement level for starting pitchers, and a .470 win% is the replacement level for relief pitchers. Replacement level FIP is set each year, and it varies depending on the league the player was in and whether they were a starter or a reliever. The American League generally has a higher replacement level, as the DH makes offense more prolific in the AL. So if league-average for American League starters was a 4.40 FIP, then replacement level would be set at the appropriate mark above that for a pitcher with a .380 win% (in this example: 5.63 FIP).

Run Scale. Since FIP is calculated so that league-average FIP is always the same as league-average ERA, it's on an Earned Runs scale, not a Runs Allowed scale. For the purposes of calculating WAR, though, it's important to have FIP values on a runs scale, since that's the same scale that offensive WAR is based on (see: wRAA). To convert FIP onto a runs scale, you divide FIP values by .92.

Park Adjustments. Replacement level FIP varies depending on the park a pitcher plays in. If a 5.63 FIP was the replacement level for an AL starter, and one ballpark depressed offense by 2%, then the replacement level FIP for that park would be 5.52, which is 2% lower than the AL replacement level. Pitchers only play half their games at home, though, so they only need to have their FIPs adjusted by half their home park factor. FanGraphs uses a five-year regressed park factor in its calculations, as single-season park factors can be flaky and variable.

Run Environment. If an elite pitcher is on the mound, their team typically doesn't need to score as many runs to win a game as they would if they had an average pitcher on the mound. In this sense, pitchers influence their own run environment, and elite pitchers will have a lower Runs Per Win conversion rate. So while league-average is 10 Runs/Win (just like with position players), star pitchers will have lower Runs/Win rates and replacement level pitchers will have higher Runs/Win rates. The conversion formula is: (((League RA + Pitcher's RA)/2) + 2) * 1.5

Put this all together, and voila, you have your pitcher WAR values! It's not the easiest thing to engineer yourself, but it is possible. And if you'd like a detailed example of how to mathematically carry out these steps, see this piece by Dave Cameron. Any other questions? Be sure to read through the entire Pitcher WAR introduction series:

Part 1 Introduction Part 2 FIP Part 3 Replacement Level Part 4 Run Scale Conversion Part 5 Runs to Wins Adjustment Part 6 Park Adjustments Part 7 Calculations

At bat
From Wikipedia, the free encyclopedia. Not to be confused with Plate appearance.

Ichiro Suzuki at bat

In baseball, an at bat (AB) or time at bat is used to calculate certain statistics, including batting average, on base percentage, and slugging percentage. It is a more restricted definition of a plate appearance. A batter starts with an at bat every time he faces a pitcher; however, the batter gets "no time at bat" in the following circumstances:

He receives a base on balls (BB).[1]
He is hit by a pitch (HBP).
He hits a sacrifice fly or a sacrifice hit (also known as sacrifice bunt).
He is awarded first base due to interference or obstruction, usually by the catcher.
The inning ends while he is still at bat (due to the third out being made by a runner caught stealing, for example). In this case, the batter will come to bat again in the next inning, though he now has no balls or strikes on him.
He is replaced by another hitter before his at bat is completed (unless he is replaced with two strikes and his replacement strikes out).

Section 10.02.a.1 of the official rules of Major League Baseball defines an at bat as: "Number of times batted, except that no time at bat shall be charged when a player: (1) hits a sacrifice bunt or sacrifice fly; (2) is awarded first base on four called balls; (3) is hit by a pitched ball; or (4) is awarded first base because of interference or obstruction..."[2]


Examples
An at bat is counted when:

The batter reaches first base on a hit
The batter reaches first base on an error
The batter is called out for any reason other than as part of a sacrifice
There is a fielder's choice
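A small Python sketch can summarize the scoring rule above: given the outcome of a completed plate appearance, decide whether it counts as an official at bat. The outcome labels are hypothetical strings chosen for this example, not an official classification.

```python
# Outcomes that do NOT count as an official at bat, per the rule quoted above.
NO_AT_BAT_OUTCOMES = {
    "walk",                  # base on balls
    "hit_by_pitch",
    "sacrifice_fly",
    "sacrifice_bunt",
    "catcher_interference",
    "obstruction",
}

def is_official_at_bat(outcome: str) -> bool:
    """Hits, errors, fielder's choices, and ordinary outs all count as at bats;
    the exceptions in the scoring rule do not."""
    return outcome not in NO_AT_BAT_OUTCOMES

if __name__ == "__main__":
    print(is_official_at_bat("single"))         # True
    print(is_official_at_bat("strikeout"))      # True
    print(is_official_at_bat("sacrifice_fly"))  # False
```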

At bat as a phrase
"At bat", "up", "up at bat", and "at the plate" are all phrases describing a batter who is facing the pitcher. Note that just because a player is described as being "at bat" in this sense, he will not necessarily be given an at bat in his statistics; the phrase actually signifies a plate appearance (assuming it is eventually completed). This ambiguous terminology is usually clarified by context. To refer explicitly to the technical meaning of "at bat" described above, the term "official at bat" is sometimes used.

References
1. ^In 1887, Major League Baseball counted bases on balls as hits. The result was high batting averages, including some near .500, and the experiment was abandoned the following season. 2. ^"Rule 10.01: The Rules of Scoring" (PDF). Official Baseball Rules. Commissioner of Baseball, Major League Baseball. Archived from the original on 19 March 2010. Retrieved 2010-03-25.

At bats per home run


From Wikipedia, the free encyclopedia.

In baseball statistics, at bats per home run (AB/HR) is a way to measure how frequently a batter hits a home run. It is determined by dividing the number of at bats by the number of home runs hit. Mark McGwire possesses the MLB record for this statistic with a career ratio of 10.61 at bats per home run, and Babe Ruth is second, with 11.76 at bats per home run.[1] Ryan Howard, at 12.104 AB/HR,[2] holds the record among active players.


Major League Baseball leaders


Career

Mark McGwire holds the career record for fewest at bats per home run with at least 3000 plate appearances, at 10.61.

Totals are current through the end of the 2011 season, minimum 3000 plate appearances. Active players in bold.[1]
1. Mark McGwire - 10.61
2. Babe Ruth - 11.76
3. Barry Bonds - 12.92
4. Ryan Howard - 13.25
5. Jim Thome - 13.73

Season

Single-season statistics are current through the end of the 2011 season. Active players in bold.[3]
1. Barry Bonds - 6.52
2. Mark McGwire - 7.27
3. Mark McGwire - 8.02
4. Mark McGwire - 8.13
5. Barry Bonds - 8.29

Babe Ruth was the first batter to average fewer than nine at-bats per home run over a season, hitting his 54 home runs of the 1920 season in 457 at-bats; an average of 8.463. Seventy-eight years later, Mark McGwire became the first batter to average fewer than eight AB/HR, hitting his 70 home runs of the 1998 season in 509 at-bats (an average of 7.2714). In 2001, Barry Bonds became the first batter to average fewer than seven AB/HR, setting the Major League record by hitting his 73 home runs of the 2001 season in 476 at-bats for an average of 6.5205. Ruth, McGwire and Bonds are the only batters in history to average nine or fewer AB/HR over a season, having done so nine times:
Nine or fewer at-bats per home run[4]

Batter          Season  HR   AB   AB/HR
Babe Ruth       1920    54   457  8.4630
Babe Ruth       1927    60   540  9.0000
Mark McGwire    1996    52   423  8.1346
Mark McGwire    1998    70   509  7.2714
Mark McGwire    1999    65   521  8.0154
Barry Bonds     2001    73   476  6.5205
Barry Bonds     2002    46   403  8.7610
Barry Bonds     2003    45   390  8.6670
Barry Bonds     2004    45   373  8.2890

Walk-to-strikeout ratio

In baseball statistics, walk-to-strikeout ratio (BB/K) is a measure of a hitter's plate discipline and knowledge of the strike zone. Generally, a hitter with a good walk-to-strikeout ratio must exhibit enough patience at the plate to refrain from swinging at bad pitches and take a base on balls, but he must also have the ability to recognize pitches within the strike zone and avoid striking out. Joe Morgan and Wade Boggs are two examples of hitters with a good walk-to-strikeout ratio.[citation needed] A hit by pitch is not counted statistically as a walk and therefore is not counted in the walk-to-strikeout ratio. The inverse of this, the strikeout-to-walk ratio, is used to compare pitchers. Best single-season walk-to-strikeout ratios, 1913-2011:
Rank  Player             Team  LG  Year  BB  SO  BB/SO
1     Joe Sewell         NYY   AL  1932  56   3  18.67
2     Joe Sewell         NYY   AL  1933  71   4  17.75
3     Joe Sewell         CLE   AL  1925  64   4  16.00
4     Joe Sewell         CLE   AL  1929  48   4  12.00
5     Charlie Hollocher  CHC   NL  1922  58   5  11.60
6     Lou Boudreau       CLE   AL  1948  98   9  10.89
7     Eddie Collins      CWS   AL  1925  87   8  10.88
8     Joe Sewell         CLE   AL  1926  65   6  10.83
9     Eddie Collins      CWS   AL  1923  84   8  10.50
10    Mickey Cochrane    PHA   AL  1929  69   8   8.63
11    Joe Sewell         CLE   AL  1923  98  12   8.17
12    Tommy Holmes       BSN   NL  1945  70   9   7.78
13    Joe Sewell         NYY   AL  1931  62   8   7.75
14    Tris Speaker       CLE   AL  1920  97  13   7.46
15    Joe Sewell         CLE   AL  1927  51   7   7.29
16    Mickey Cochrane    PHA   AL  1927  50   7   7.14
17    Tris Speaker       CLE   AL  1918  64   9   7.11
18    Lou Boudreau       CLE   AL  1949  70  10   7.00
18    Tris Speaker       CLE   AL  1922  77  11   7.00

Base runs

Base runs (BsR) is a baseball statistic invented by sabermetrician David Smyth to estimate the number of runs a team "should" have scored given their component offensive statistics, as well as the number of runs a hitter/pitcher creates/allows. It measures essentially the same thing as Bill James' Runs Created, but as sabermetrician Tom M. Tango points out, BaseRuns models the reality of the run-scoring process significantly better than any other "run estimator".


Purpose and formula


Base runs takes the general form BsR = (A × B) / (B + C) + D, where A is the baserunners factor, B the advancement factor, C the outs factor, and D the number of home runs. The specific components and coefficients were described in Smyth's Base Runs Primer.
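Because the exact coefficients are not reproduced here, the sketch below uses one commonly cited simple version of the BaseRuns components (A = H + BB - HR, B = (1.4·TB - 0.6·H - 3·HR + 0.1·BB) × 1.02, C = AB - H, D = HR); treat the coefficients as an illustrative assumption rather than Smyth's definitive published formula.

```python
def base_runs(h, bb, hr, tb, ab):
    """Estimate runs with the BaseRuns structure A*B/(B+C) + D.

    The coefficients below follow one commonly cited simple version and
    are assumptions for illustration, not Smyth's canonical values.
    """
    a = h + bb - hr                                       # baserunners other than home runs
    b = (1.4 * tb - 0.6 * h - 3 * hr + 0.1 * bb) * 1.02   # advancement factor
    c = ab - h                                            # outs (simple approximation)
    d = hr                                                # a home run always scores the batter
    return a * b / (b + c) + d

# Sanity check from the text: a lone solo home run predicts exactly one run
print(base_runs(h=1, bb=0, hr=1, tb=4, ab=1))  # 1.0
```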

Advantages of base runs


Base Runs was primarily designed to provide an accurate model of the run scoring process at the Major League Baseball level, and it accomplishes that goal very well: in recent seasons, BsR has the lowest RMSE of any of the major run estimation methods. But in addition, Base Runs can claim something no other run estimator can -- its accuracy holds up in even the most extreme circumstances and leagues. For instance, when a solo home run is hit, Base Runs will correctly predict one run having been scored by the batting team. By contrast, when Runs Created assesses a solo HR, it predicts 4 runs to be scored; likewise, most linear weights-based formulas will predict a number close to 1.4 runs having been scored on a solo HR.

This is because each of these models was developed to fit the sample of a 162-game MLB season; they work well when applied to that sample, of course, but are woefully inaccurate when taken out of the environment for which they were designed. Base Runs, on the other hand, can be applied to any sample at any level of baseball (provided you can calculate the B multiplier), because it models the way the game of baseball operates, and not just for a 162-game season at the highest professional level. This means Base Runs can be applied to high school or even Little League statistics.

Weaknesses of base runs


From the TangoTiger wiki: "Base Runs adheres to more of the fundamental constraints on run scoring than most other run estimators, but it is by no means perfectly compliant. Some examples of shortcomings:

BsR will sometimes give a negative estimate; this happens when the B factor is negative.
BsR will sometimes project many more than three runners left on base per inning, despite the fact that three is the upper limit. For example, if walks have a B coefficient of .1, an inning with 10 walks and 3 outs will yield an estimate of 10*1/(1+3) = 2.5 runs, meaning that 7.5 runners must have been stranded.
Tangotiger's research found that BsR overvalued events within the .500-.800 team OBP range

One avenue for possible improvement in the model is the scoring rate estimator B/(B + C). There is no deep theory behind this construct--it was chosen because it worked empirically. It is possible that a better score rate estimator could be developed, although it would most likely have to be more complex than the current one."

Double play

After stepping on second base, the fielder throws to first to complete a double play

In baseball, a double play (denoted on statistics sheets by DP) for a team or a fielder is the act of making two outs during the same continuous playing action. In baseball slang, making a double play is referred to as "turning two". Double plays are also known as "the pitcher's best friend" because they disrupt offense more than any other play, except for the rare triple play. Pitchers often select pitches that make a double play more likely (typically a pitch easily hit as a ground ball to a middle infielder) and teams on defense alter infield positions to make a ground ball more likely to be turned into a double play. Because a double play ends an inning in a one-out situation, it often makes the scoring of a run impossible in that inning. In a no-out situation with runners at first base and third base, the double play may be so desirable that the defensive team allows a runner to score from third base so that two outs are made and further scoring by the batting team is more difficult.


Scoring of double plays


Double plays in which both outs are recorded by force plays or the batter-runner being put out at first base are referred to as "force double plays".[1] Double plays in which the first out is recorded via a force play or putting the batter-runner out at first base and the second out by tagging a runner who would have been forced out but for the first out (as when a first baseman fields a ground ball, steps on first base, and then throws to second) are referred to as "reverse force double plays". Should a run score on a play in which a batter hits into either a force double play or a reverse force double play, the official scorer may deny the batter credit for an RBI, although the batter always gets credit for an RBI on a one-out groundout or a fielder's choice play in which a baserunner scores. Records of double plays were not kept regularly until 1933 in the National League and 1939 in the American League. Double plays initiated by a batter hitting a ground ball (but not a fly ball or line drive) are recorded in the official statistic GIDP (grounded into double play).

Types of double plays


Common double plays

The most common type of double play occurs with a runner on first base and a ground ball hit towards the middle of the infield. The player fielding the ball (generally the shortstop or second baseman) throws to the fielder covering second base, who steps on the base before the runner from first arrives to force that runner out, and then throws the ball to the first baseman to put out the batter-runner for the second out. If the ball originated with the shortstop and was then thrown to the second baseman, the play is referred to as a "6-4-3 double play", after the numbers assigned to the players in order of field position; if it is hit to the second baseman and then thrown to the shortstop, it is known as a 4-6-3 double play (6-shortstop, 4-second base, 3-first base; see baseball positions). A slightly less common ground ball double play is the 5-4-3 double play, also called the "Around the Horn" double play which occurs on a ground ball hit to the third baseman (5), who throws to the second baseman (4) at second base, who then throws to the first baseman (3). Comparatively few third basemen succeed often at turning such double plays which require a third baseman with good range and a great throwing arm. Rarer still is a 4-3-6 or 6-3-4 double play in which a middle infielder throws first to the first baseman to retire the batter and the first baseman then throws to the other middle infielder who tags the runner from first base (the force situation having been removed when the batter was put out). Double plays also occur on ground balls hit to the pitcher. Most of the time, these double plays will go 1-6-3 (pitcher to shortstop to first baseman), though sometimes these double plays will go 1-4-3 (pitcher to second baseman to first baseman). 6-3 and 4-3 double plays occur on ground balls to the shortstop or second baseman, respectively, which the fielder

takes for an unassisted putout at second before throwing to first. The 3-6-3 double play occurs on a ground ball to the first baseman, who throws to the shortstop at second base before stepping on first. Thus, the shortstop can throw back to the first baseman, who is still able to get the putout at first. Variants of this double play include the 3-6-1 double play (where the pitcher covers first) and the 3-6-4 double play (where the second baseman covers first). Also, the first baseman may choose to retire the batter at first before throwing to the shortstop at second, who then tags the runner coming from first (tag because the force has been removed). More rare double plays include the 1-6-4-3, and the 1-4-6-3 double play. In these, the pitcher will "kick-save" the ball (instinctively knocking down the batted ball with his foot), or the ball will deflect off some other part of the pitcher's body. Another class of double plays include those in which infielders catch line drives and then throw or run to a base to catch a baserunner who fails to return to the base from which he has started. The batter is out because his ball has been caught on the fly, and a runner is out at another base. For example, if a batter hits a line drive to the second baseman (or any other infielder, or the pitcher) that a baserunner from first base thinks is a clean hit and the second baseman catches before it drops, then the second baseman can throw to first base to the fielder (usually the first baseman) covering the base; should the first baseman either touch first base with any part of his body (usually his feet) or tag the baserunner returning to first (not necessary), then a double play is completed. More rare is an unassisted double play in which the fielder catches a line drive and either tags a runner off base or tags a base that a baserunner cannot return to on time. A "strike 'em out, throw 'em out" double play requires that a base runner is caught stealing immediately after the batter strikes out. The batter is out on a called or swinging third strike while the runner is caught (typically for this play the shortstop). Such is a 2-6 double play unless a rundown ensues or the play is made at some base other than second base. The catcher gets a putout for the strikeout and an assist for the throw that leads to the caught stealing. On occasion, bad bunts can result in double plays. An attempted sacrifice bunt may be laid down such that a charging pitcher, first baseman or catcher (the typical initiators of such plays) is able to field the ball, throw to second base to force a runner out, and the shortstop (the usual fielder at second base on a bunt play) then is able to throw to the fielder covering first base (usually the second baseman) to put out the batter. With a runner on first base, should the batter bunt a ball fair as an infield fly, the infield fly rule that protects baserunners is no longer applicable. At his discretion, the fielder in position to catch the bunted fly ball may elect to 'trap' the fly ball (that is, put his glove on the ground but over the ball to secure it) or (a fielder is not allowed to drop a ball deliberately to force runners to advance) catch it on a short bounce, in which case the runner at first must reach second base before a throw is made to second base. The fielder covering second base can throw to first base to complete the double play. 
Should the runner at first stray too far from first base, however, and the infielder catches the pop fly, the infielder gets the out for catching an infield fly and throws to first base to complete the double play.
Rare double plays

Another double play occurs when a fly ball is hit to the outfield and caught, but a runner on the basepath strays too far away from his base. If the ball is thrown back to that base before the runner returns or tags up to go to the next base, the runner is out along with the batter for a double play.

A strike-'em-out-throw-'em-out double play at third base is rare, and an unassisted double play by the catcher on a play in which a baserunner tries to steal home base on a straight steal during a strikeout is highly unlikely. It is possible that a risky baserunning play will ensue during a strikeout in which a baserunner attempts to score from third base on a successful steal of second or a rundown play in which an infielder or pitcher throws to the catcher who then tags a runner trying to score, in which case the catcher gets two putouts (one for the strikeout and one for the tag play at home) and one assist for a throw to the infield. A catcher might also initiate a double play on a strikeout that begins with a dropped third strike and a throw to first to put out the batter-runner and a baserunner attempts to reach third or home on an attempted steal. Two others involve outfield flies: more commonly, a baserunner tags up from third base on an outfield fly, attempting to score before a throw from the outfielder (more rarely an infielder) can be thrown to the catcher. Should the catcher tag the runner before he can score, the play is considered a double play, and the outfielder is credited with an assist. Similar plays can be made at second base or third base, or in rundown plays on the infield. Many outfield assists are made on such plays, and the most assists made in any given year by a single outfielder is typically about twenty (they need not be made on double plays). Far rarer is a play in which the runner attempts to advance before the outfielder catches the fly ball. As a rule the double play is completed after the pitcher receives the ball and throws to the base that the runner has left too soon; on appeal the base-runner who left the base too early is called out on the play. A rare double play that can only take place with the bases loaded is a play in which a sharplyhit ball is fielded by an infielder, who throws to home to force the runner coming in from third. The catcher then throws the ball to the fielder covering first base to retire the batter. Such a double play ended the top half of the 8th inning during Game 7 of the 1991 World Series: With one out and the bases loaded, Atlanta'sSid Bream hit a ground ball at Twins first baseman Kent Hrbek, who fielded it and threw it to catcher Brian Harper to retire Lonnie Smith at home. Harper then threw back to Hrbek to retire the side. Another variation of this play, in which the pitcher, and not an infielder, first fields the ground ball is the "1-2-3 double play." Such a play occurred in the no-hitshutout that Jack Morris pitched in 1984.[2] A bizarre double play occurred in a nationally televised game between the New York Yankees and Chicago White Sox on August 2, 1985 when both Bobby Meacham and Dale Berra were tagged out at home by Carlton Fisk on a deep drive single to left-center by Rickey Henderson. An identical situation would occur again in the 2006 NLDS between the Dodgers and Mets when Russell Martin hit a single to right field and Paul Lo Duca tagged out Jeff Kent and J.D. Drew at the plate. An unusual double play occurred on April 12, 2008, Yankees at Red Sox. The infield was shifted right for Jason Giambi, with a baserunner on first. Giambi grounded to 2nd baseman Dustin Pedroia, who threw to the 3rd baseman Kevin Youkilis, who because of the shift actually had to cover 2nd base. Youkilis tagged second, then turned the DP by throwing to 1st baseman Sean Casey, to get Giambi out. 
This would therefore be a rare "4-5-3" double play. A similar double play occurred in an interleague game in Nippon Professional Baseball on June 14, 2009, between the Hiroshima Toyo Carp and the Saitama Seibu Lions: the Carp deployed a five-man infield, with the left fielder covering the area between the pitcher and second base, and the batter grounded into a rare "7-2-3" double play when the ball was hit to the shifted fielder.[3]

Another rare double play involves interference, where an offensive player hinders a defensive player's attempt at throwing the ball to make an out. Such a double play happened on July 24, 2000 in a game between the Anaheim Angels and the Texas Rangers. In the first inning, Mo Vaughn of the Angels struck out swinging, and the Ranger catcher Pudge Rodriguez attempted to throw out Kevin Stocker, who was trying to steal 2nd base. In his follow through, Rodriguez's hand hit Vaughn's bat, preventing him from making an accurate throw to 2nd base. Home plate umpire Gerry Davis called Stocker out at 2nd due to batter's interference. Scoring-wise, the play went as a strikeout for the pitcher Kenny Rogers, an unassisted double play for Rodriguez, and batter's interference on Vaughn.

On May 27, 2011, the AA New Britain Rock Cats turned a double play against the Binghamton Mets involving seven defenders. With runners on second and third, a ground ball was hit to the first baseman. The first baseman threw to the catcher to tag the runner from third. The catcher chased the runner back to third. By then, the runner from second was most of the way to third, so the catcher threw to the shortstop. Then the runner from third tried to go home again, so the shortstop threw to the pitcher, now covering home, who then threw to the third baseman, who got the runner out. At this point, the batter was between first and second, so the third baseman threw to the first baseman to chase him down. He threw to the second baseman. Then the runner from second ran again, so the second baseman threw to the shortstop. The shortstop threw to the center fielder, now covering second, who tagged the runner for the second out. The play was scored as a 3-2-6-1-5-3-4-6-8 double play.[4][5][6]

Strategy

Highly desirable to the pitching team and highly undesirable to the batting team, the double play often proves critical to wins and losses of specific games. The pitching team is likely to change pitch selection and defensive alignment to make one of the more common double plays (the ones involving infield ground balls) more likely. Batting teams may adapt themselves to thwart or even exploit the situation. A so-called double-play position involves the second baseman and shortstop moving away from second base so that one of the fielders can field a ground ball and the other can run easily to second base to catch a ball thrown to him so that he can tag the base before the baserunner from first base can reach second base, the infielder tagging second base then throwing to first base to complete the double play. The pitcher tries to throw a pitch in the strike zone that, if hit, is likely to be grounded to an infielder (or the pitcher) and turned into a double play. In a situation with runners on second and third and fewer than two outs, a team may decide to give an intentional pass to a hitter, often a slow baserunner who is perceived as one of the more dangerous hitters on the team or to the pitcher. A double play is then possible on a ground ball to a middle infielder. However: (1) a subsequent walk scores a run, and

(2) the batter reaching first base on the intentional walk may score on subsequent plays should no outs be made. This situation allows a great reward to the pitching team should it succeed in inducing a double play (far less opportunity of scoring) but also great reward to the batting team should it fail. Batting teams can select lineups to reduce the likelihood of double plays by alternating slow right-handed hitters with left-handed hitters or hitters who are fast baserunners, or by putting a slow-running slugger (typically a catcher) in a low spot in the batting order (often #7 where there is no designated hitter). In a situation where a double play is possible, the batting team can

attempt to steal second base if it is unoccupied (but only with a fast baserunner)
sacrifice bunt, which concedes an out but advances the baserunner and prevents a double play
either avoid swinging at pitches likely to become infield ground outs or foul them off
avoid pulling the ball (a ground ball "pulled" by a right-handed batter to the left side of the infield is a likely double-play ball)
hit and run, a play in which the baserunner on first runs to second immediately after the pitch is thrown in the hope that the batter makes contact with the pitch
try to hit the ball as a long fly ball, ideally a home run

All of these strategies entail risk and may be either inappropriate or impossible, depending on the situation. A stolen base attempt ensures that the runner on first base is either at second (making a double play impossible) or out (likewise, but with an out and the loss of a baserunner). Some batters cannot bunt well, and poor bunts can themselves result in double plays. Avoiding the double-play pitch may mean taking a called strike. Trying not to pull the ball decreases the possibility of a home run that scores two or more runs. The hit-and-run play requires that the batter hit the ball, lest the baserunner be caught stealing on a throw from the catcher to the shortstop or second baseman covering second base and makes a pick-off of a baserunner more likely. A strikeout-prone hitter who swings wildly in the hope of getting a pitch that he can hit as a long fly ball as a sacrifice fly, double, triple, or home run is more likely to strike out. Because the rarer double plays require baserunning errors, no team relies upon them to get out of a bad situation unless the opportunity arises. Even extreme strikeout pitchers such as Randy Johnson or Pedro Martnez sometimes have to rely on double plays to be effective. The ability to "make the pivot" on an infield double play, i.e. receive a throw from the thirdbase side, then turn and throw the ball to first in time to force-out the batsman, while avoiding being "taken out" by the runner, is considered to be a key skill for a second baseman. Cal Ripken, Jr. holds the major league record for most double plays grounded into in a career, with 350. He also holds the American League record for most double plays made by a shortstop. Both records are a consequence of his longevity as a player and the long grass at the Baltimore baseball stadium (Camden Yards and Memorial Stadium) as well as an accurate and strong throwing arm that allowed him to start more double plays than most other shortstops. As a batter, Ripken was a slow baserunner throughout his career, so he was less likely to reach base safely on a ground ball hit to the infield. A reasonably powerful right-

handed hitter who frequently hit near the middle of the batting order and did not strike out at a high rate, he frequently came to the plate with runners on base, and usually made solid contact (as opposed to bunting) to usually put the ball in play. More likely to hit the ball sharply to the left side of the infield, placed in the order of the lineup so that he usually had runners on base ahead of him, and less likely to beat throws to first base, and having a very long career because he was a good hitter for average and power, this competent hitter grounded into an unusual number of double plays. A batter who grounded into comparatively few double plays (72 in his long career)[2] was Kirk Gibson, a left-handed hitter and a fast runner who struck out often but largely hit fly balls and hit few ground balls. As a left-handed hitter, if he pulled the ball and put it on the ground, he usually pulled the ball to the right side of the infield. To complete a double play on a ground ball that he did hit, a team had to complete the usually-difficult 3-6-3 or 3-6-1 double play; Gibson would usually reach first base before the double play could be completed in part because the 3-6-3 and 3-6-1 double plays take two long throws and in part because as a left-handed hitter he had a slightly-shorter run to first base. Like Ripken he was a power hitter usually batting in the middle of the batting order and often with runners on first base; unlike Ripken he hit far fewer balls toward fielders who could turn double plays upon him and struck out far more often, his strikeouts making a GIDP impossible.

Gross Production Average



Gross Production Average or GPA is a baseball statistic created in 2003 by Aaron Gleeman,[1] as a refinement of On-Base Plus Slugging (OPS).[2][3] GPA attempts to solve two frequently cited problems with OPS. First, OPS gives equal weight to its two components, On Base Percentage (OBP) and Slugging Percentage (SLG). In fact, OBP contributes significantly more to scoring runs than SLG does. Sabermetricians have calculated that OBP is about 80% more valuable than SLG.[4][5] A second problem with OPS is that it generates numbers on a scale unfamiliar to most baseball fans. For all the problems with a traditional stat like batting average (AVG), baseball fans immediately know that a player batting .365 is significantly better than average, while a player batting .167 is significantly below average. But many fans don't immediately know how good a player with a 1.013 OPS is.

The basic formula for GPA is:[6]

GPA = (1.8 × OBP + SLG) / 4

Unlike OPS, this formula both gives proper relative weight to its two component statistics and generates a number that falls on a scale similar to the familiar batting average scale.[7]
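A minimal sketch of the calculation (the function name is illustrative):

```python
def gross_production_average(obp: float, slg: float) -> float:
    """GPA = (1.8 * OBP + SLG) / 4, scaled to resemble a batting average."""
    return (1.8 * obp + slg) / 4

# e.g. a hitter with a .400 OBP and a .500 SLG
print(round(gross_production_average(0.400, 0.500), 3))  # 0.305
```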

On-base percentage

In baseball statistics, on-base percentage (OBP; sometimes referred to as on-base average/OBA, as the statistic is rarely presented as a true percentage) is a measure of how often a batter reaches base for any reason other than a fielding error, fielder's choice, dropped/uncaught third strike, fielder's obstruction, or catcher's interference (the latter two are not counted as either times on base (TOB) or plate appearances when calculating OBP). OBP is added to slugging average to determine on-base plus slugging (OPS). It first became an official MLB statistic in 1984.


Overview
Traditionally, players with the best on-base percentages bat as leadoff hitters. The league average for on-base percentage has varied considerably over time; in the modern era it is around .340, whereas it was typically only .300 in the dead-ball era. On-base percentage can also vary quite considerably from player to player. The record for the highest career OBP by a hitter, based on over 3000 plate appearances, is .482 by Ted Williams. The lowest is by Bill Bergen, who had an OBP of .194.

For small numbers of at-bats, it is possible (though unlikely) for a player's on-base percentage to be lower than his batting average (H/AB). This happens when a player has almost no walks or times hit by pitch, with a higher number of sacrifice flies (e.g. if a player has 2 hits in 6 at bats plus a sacrifice fly, his batting average would be .333, but his on-base percentage would be .286). The player who experienced this phenomenon with the most at-bats over a full season was Ernie Bowman, who over 125 at-bats in 1963 had a batting average of .184 and an on-base percentage of .181.

On-base percentage is calculated using this formula:

OBP = (H + BB + HBP) / (AB + BB + HBP + SF)

where

H = Hits
BB = Bases on Balls (Walks)
HBP = Hit By Pitch
AB = At bats
SF = Sacrifice Flies

NOTE: Sacrifice flies were not counted as an official statistic until 1954. Before that time, all sacrifices were counted as sacrifice hits (SH), which included both sacrifice flies and bunts.

Sacrifice bunts (sacrifice hits since 1954), which would lower a batter's on-base percentage, are not included in the calculation for on-base percentage, as bunting is an offensive strategy often dictated by the manager, the use of which does not necessarily reflect on the batter's ability and should not be used to penalize him. For calculations of OBP before 1954, or where sacrifice flies are not explicitly listed, the number of sacrifice flies should be assumed to be zero.
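The formula above translates directly into code; this minimal sketch reuses the 2-for-6-plus-a-sacrifice-fly example from the text (function and parameter names are illustrative):

```python
def on_base_percentage(h, bb, hbp, ab, sf):
    """OBP = (H + BB + HBP) / (AB + BB + HBP + SF)."""
    return (h + bb + hbp) / (ab + bb + hbp + sf)

# The example from the text: 2 hits in 6 at bats plus a sacrifice fly, no walks or HBP
print(round(on_base_percentage(h=2, bb=0, hbp=0, ab=6, sf=1), 3))  # 0.286
```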

On-base plus slugging



Carlos Beltrán holds the highest OPS in the history of the Major League Baseball postseason.

On-base plus slugging (OPS) is a sabermetric baseball statistic calculated as the sum of a player's on-base percentage and slugging average.[1] It represents a player's ability both to get on base and to hit for power, two important hitting skills. An OPS of .900 or higher in Major League Baseball puts the player in the upper echelon of hitters. Typically, the league leader in OPS will score near, and sometimes above, the 1.000 mark.


Formula
The basic formula is

OPS = OBP + SLG

where OBP is on-base percentage and SLG is slugging average. These averages are defined as

OBP = (H + BB + HBP) / (AB + BB + SF + HBP)

and

SLG = TB / AB

where:

H = Hits
BB = Base on balls
HBP = Times hit by pitch
AB = At bats
SF = Sacrifice flies
TB = Total bases

In one formula, OPS can be represented as:

OPS = (AB × (H + BB + HBP) + TB × (AB + BB + SF + HBP)) / (AB × (AB + BB + SF + HBP))
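As a sketch, the combined formula is equivalent to computing the two components separately and adding them (names and the sample totals are illustrative):

```python
def ops(h, bb, hbp, ab, sf, tb):
    """On-base plus slugging computed from raw counting stats."""
    obp = (h + bb + hbp) / (ab + bb + sf + hbp)  # on-base percentage
    slg = tb / ab                                # slugging average
    return obp + slg

# Hypothetical season line: 150 H, 60 BB, 5 HBP, 500 AB, 5 SF, 250 TB
print(round(ops(150, 60, 5, 500, 5, 250), 3))
```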

Interpretation of OPS
OPS does not present a complete picture of a player's offensive contributions. Factors such as baserunning, basestealing, and the leverage/timeliness of performance are not considered. More expansive sabermetric measurements do attempt to incorporate some or all of the abovementioned factors. Nonetheless, even though it does not include them, OPS correlates quite well with team run scoring. Other sabermetric stats, such as runs created and Wins Above Replacement, attempt to express a player's contribution directly in terms of runs and/or wins. However, a player's OPS does not have a simple intrinsic meaning. OPS weighs on-base percentage and slugging average equally. However, on-base percentage correlates better with scoring runs.[2] Statistics such as wOBA build on this distinction using linear weights, avoiding OPS' flaws. Magnifying this fault is that the numerical parts of OPS are not themselves typically equal (league-average slugging percentages are usually 75-100

points higher than league-average on-base percentages). As a point of reference, the OPS for all of Major League Baseball in 2008 was .749.[3] OPS has the advantage of being based on two already well-established stats whose legitimacy within baseball is uncontroversial. And if those stats are known, it is extremely easy to calculate, as one need only add the two numbers together.

An OPS scale
Bill James, in his essay titled "The 96 Families of Hitters",[4] uses seven different categories for classification by OPS:
Category  Classification  OPS Range
A         Great           .9000 and Higher
B         Very Good       .8333 to .8999
C         Above Average   .7667 to .8333
D         Average         .7000 to .7666
E         Below Average   .6334 to .6999
F         Poor            .5667 to .6333
G         Atrocious       .5666 and Lower

This effectively transforms OPS into a 7-point Likert scale. Substituting typical Likert-scale quality values such as Excellent (A), Very Good (B), Good (C), Average (D), Fair (E), Poor (F) and Very Poor (G) for the A-G categories creates a subjective reference for OPS values.

History
On-base plus slugging was first popularized in 1984 by John Thorn and Pete Palmer's book, The Hidden Game of Baseball.[5] The New York Times then began carrying the leaders in this statistic in its weekly "By the Numbers" box, a feature that continued for four years. Baseball journalist Peter Gammons used and evangelized the statistic, and other writers and broadcasters picked it up. The popularity of OPS gradually spread, and by 2004 it began appearing on Topps baseball cards.[6] OPS was formerly sometimes known as "Production", for instance in early versions of Thorn's Total Baseball encyclopedia, and in the Strat-O-Matic computer baseball game. This term has fallen out of use.

Leaders

The Top 10 Major League Baseball players in lifetime OPS, with at least 3,000 plate appearances through May 28, 2012 are (active players in bold):
1. Babe Ruth, 1.1638
2. Ted Williams, 1.1155
3. Lou Gehrig, 1.0798
4. Barry Bonds, 1.0512
5. Jimmie Foxx, 1.0376
6. Albert Pujols, 1.0264
7. Hank Greenberg, 1.0169
8. Rogers Hornsby, 1.0103
9. Manny Ramírez, 0.9970
10. Mark McGwire, 0.9823

The top four were all left-handed batters. Jimmie Foxx has the highest career OPS for a right-handed batter.

Source: Baseball-Reference.com - Career Leaders & Records for OPS

The Top 10 single-season performances in MLB are (all left-handed hitters):
1. Barry Bonds, 1.4217 (2004)
2. Babe Ruth, 1.3818 (1920)
3. Barry Bonds, 1.3807 (2002)
4. Barry Bonds, 1.3785 (2001)
5. Babe Ruth, 1.3586 (1921)
6. Babe Ruth, 1.3089 (1923)
7. Ted Williams, 1.2875 (1941)
8. Barry Bonds, 1.2778 (2003)
9. Babe Ruth, 1.2582 (1927)
10. Ted Williams, 1.2566 (1957)

The highest single-season mark for a right-handed hitter was 1.2449 by Rogers Hornsby in 1925 (13th on the all-time list). Since 1925, the highest single-season OPS for a right-hander is 1.2224 by Mark McGwire in 1998, which ranks 16th all-time.

Source: Baseball-Reference.com - Single-Season Records for OPS

Adjusted OPS (OPS+)


OPS+, or Adjusted OPS, is a closely related statistic. OPS+ is OPS adjusted for the park and the league in which the player played, but not for fielding position. An OPS+ of 100 is defined to be the league average. An OPS+ of 150 or more is excellent and 125 very good, while an OPS+ of 75 or below is poor. The basic formula for OPS+ is

OPS+ = 100 × (OBP / *lgOBP + SLG / *lgSLG - 1)

where *lgOBP is the park-adjusted OBP of the league (not counting pitchers' hitting) and *lgSLG is the park-adjusted SLG of the league. A common misconception is that OPS+ closely matches the ratio of a player's OPS to that of the league. In fact, due to the additive nature of the two components in OPS+, a player with an OBP and SLG both 50% better than league average in those metrics will have an OPS+ of 200 (twice the league average OPS+) while still having an OPS that is only 50% better than the average OPS of the league. It would be a better (although not exact) approximation to say that a player with an OPS+ of 150 produces 50% more runs, in a given set of plate appearances, than a player with an OPS+ of 100.
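A minimal sketch of the basic OPS+ formula above; obtaining properly park-adjusted league rates is left out, so the league inputs here are assumed to be already adjusted.

```python
def ops_plus(obp, slg, lg_obp, lg_slg):
    """Basic OPS+ = 100 * (OBP/lgOBP + SLG/lgSLG - 1).

    lg_obp and lg_slg are assumed to be park-adjusted league rates
    with pitchers' hitting excluded, as described above.
    """
    return 100 * (obp / lg_obp + slg / lg_slg - 1)

# A hitter 50% better than the league in both components comes out at 200
print(round(ops_plus(0.495, 0.630, 0.330, 0.420)))  # 200
```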
Leaders in OPS+

Through May 28, 2012, the career leaders in OPS+ (minimum 3,000 plate appearances, active players in bold) were:

1. Babe Ruth, 206
2. Ted Williams, 190
3. Barry Bonds, 181
4. Lou Gehrig, 178
5. Rogers Hornsby, 175
6. Mickey Mantle, 172
7. Dan Brouthers, 170
8. Joe Jackson, 169
9. Ty Cobb, 168
9. Albert Pujols, 168
11. Pete Browning, 163
12. Jimmie Foxx, 163

Source: Baseball-Reference.com - Career Leaders & Records for Adjusted OPS+.

The only purely right-handed batters to appear on this list are Hornsby, Pujols, and Foxx. Mantle is the only switch-hitter in the group.

The highest single-season performances were:
1. Barry Bonds, 268 (2002)
2. Barry Bonds, 263 (2004)
3. Barry Bonds, 259 (2001)
4. Fred Dunlap, 258 (1884) *
5. Babe Ruth, 256 (1920)
6. Babe Ruth, 239 (1921)
7. Babe Ruth, 239 (1923)
8. Ted Williams, 235 (1941)
9. Ted Williams, 233 (1957)
10. Ross Barnes, 231 (1876) **
11. Barry Bonds, 231 (2003)

Source: Baseball-Reference.com - Single-Season Leaders & Records for Adjusted OPS+

* Fred Dunlap's historic 1884 season came in the Union Association, which some baseball experts consider not to be a true major league.
** Ross Barnes may have been aided by a rule that made a bunt fair if it first rolled in fair territory. He did not play nearly so well when this rule was removed, although injuries may have been mostly to blame, as his fielding statistics similarly declined.

If Dunlap's and Barnes' seasons were to be eliminated from the list, two other Ruth seasons (1926 and 1927) would be on the list. This would also eliminate the only right-handed batter in the list, Barnes.

Plate appearance

Jimmy Rollins holds the single season record for most plate appearances, at 778.

In baseball statistics, a player is credited with a plate appearance (denoted by PA) each time he completes a turn batting. A player completes a turn batting when:

he strikes out or is declared out before reaching first base; or
he reaches first base safely or is awarded first base (by a base on balls, hit by pitch, or catcher's interference); or
he hits a fair ball which causes a preceding runner to be put out for the third out before he himself is put out or reaches first base safely (see also left on base, fielder's choice, force play)


Calculating
A batter is not charged with a plate appearance if, while he was at bat, a preceding runner is put out on the basepaths for the third out in a way other than by the batter putting the ball into play (i.e., picked off, caught stealing). In this case, the same batter continues his turn batting in the next inning with no balls or strikes against him. A batter is also not charged with a plate appearance if, while he was at bat, the game ends as the winning run scores from third base on a balk.

A batter may or may not be charged with a plate appearance (and possibly an at-bat) in the rare instance when he is replaced by a pinch hitter after having already started his turn at bat. In this case, the pinch hitter receives the plate appearance (and the potential at-bat) unless the original batter is replaced with two strikes against him and the pinch hitter subsequently completes the strikeout, in which case the plate appearance and at-bat are charged to the first batter (see rule 10.15b).

PA = AB + BB + HBP + SH + SF + Times Reached on Defensive Interference

Basically, "plate appearances" = at bats + some of the scenarios excluded from at bats, such as base on balls, hit by pitch, sacrifice, or catcher's interference, which positively affect the offensive team.
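The PA identity above is straightforward to compute; a minimal sketch (parameter names are illustrative):

```python
def plate_appearances(ab, bb, hbp, sh, sf, reached_on_interference=0):
    """PA = AB + BB + HBP + SH + SF + times reached on defensive interference."""
    return ab + bb + hbp + sh + sf + reached_on_interference
```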

Other uses
In common terminology, the term "at bat" is sometimes used to mean "plate appearance" (for example, "he fouled off the ball to keep the at bat alive"). The intent is usually clear from the context, although the term "official at bat" is sometimes used to explicitly refer to an at bat as distinguished from a plate appearance. However, terms such as turn at bat or time at bat are synonymous with plate appearance.

Scoring
Section 10 of the official rules states that an at bat is not counted when the player:[1]
1. hits a sacrifice bunt or sacrifice fly
2. is awarded first base on four called balls
3. is hit by a pitched ball
4. is awarded first base because of interference or obstruction

The main use of the plate appearance statistic is in determining a player's eligibility for leadership in some offensive statistical categories, notably batting average; currently, a player must have 3.1 PAs per game scheduled to qualify for the batting title (for the 162-game schedule, that means 502 PAs).[1]

Also, it is often erroneously cited that total plate appearances is the divisor (i.e., denominator) used in calculating on-base percentage (OBP), an alternative measurement of a player's offensive performance; in reality, the OBP denominator does not include certain PAs, such as times reached via either catcher's interference or fielder's obstruction, or sacrifice hits (sacrifice flies are included).

Plate appearances are also used by scorers for "proving" a box score. If the game has been scored correctly, the total number of plate appearances for a team should equal the total of that team's runs, men left on base, and men put out.
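A tiny sketch of the "proving" check described above (all names and totals are illustrative):

```python
def box_score_proves(team_pa, runs, left_on_base, putouts):
    """A box score 'proves' when team PA equals runs + men left on base + men put out."""
    return team_pa == runs + left_on_base + putouts

# Hypothetical totals: 38 PA, 4 runs, 7 left on base, 27 putouts
print(box_score_proves(38, 4, 7, 27))  # True
```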

Plate appearances per strikeout



In baseball statistics, plate appearances per strikeout (PA/SO) is the ratio of a batter's plate appearances to his strikeouts. This statistic allows a defensive team to examine the opposing team's lineup for hitters who are more prone to strike out. Such players, when batting, are typically more aggressive than the average hitter. This knowledge permits the pitcher to approach the batter with more pitching options, often throwing more balls out of the strike zone in the hope that the batter will swing and miss. The statistic is calculated by dividing a player's total number of plate appearances by his total number of strikeouts. For example, Reggie Jackson collected 2,597 strikeouts and 11,418 plate appearances in his 21-year baseball career, recording a 4.39 PA/SO, which means that for roughly every 4.39 plate appearances Jackson had one strikeout.

Runs created

Runs created (RC) is a baseball statistic invented by Bill James to estimate the number of runs a hitter contributes to his team.


Purpose
James explains in his book, The Bill James Historical Baseball Abstract, why he believes runs created is an essential thing to measure: With regard to an offensive player, the first key question is how many runs have resulted from what he has done with the bat and on the basepaths. Willie McCovey hit .270 in his career, with 353 doubles, 46 triples, 521 home runs and 1,345 walks -- but his job was not to hit doubles, nor to hit singles, nor to hit triples, nor to draw walks or even hit home runs, but rather to put runs on the scoreboard. How many runs resulted from all of these things?[1] Runs created attempts to answer this bedrock question. The conceptual framework of the "runs created" stat is:

RC = (A × B) / C

where

A = On-base factor
B = Advancement factor
C = Opportunity factor

Formula
Basic runs created

In the most basic runs created formula:

RC = (H + BB) × TB / (AB + BB)

where H is hits, BB is base on balls, TB is total bases and AB is at-bats. This can also be expressed as

OBP × SLG × AB, or OBP × TB

where OBP is on base percentage, SLG is slugging average, AB is at-bats and TB is total bases.
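A minimal sketch of the basic formula in both of its equivalent forms (function names are illustrative):

```python
def runs_created_basic(h, bb, tb, ab):
    """Basic runs created: (H + BB) * TB / (AB + BB)."""
    return (h + bb) * tb / (ab + bb)

def runs_created_rate_form(obp, slg, ab):
    """Equivalent expression from the text: OBP * SLG * AB."""
    return obp * slg * ab
```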
"Stolen base" version of runs created

This formula expands on the basic formula by accounting for a player's basestealing ability:

RC = (H + BB - CS) × (TB + 0.55 × SB) / (AB + BB)

where H is hits, BB is base on balls, CS is caught stealing, TB is total bases, SB is stolen bases, and AB is at bats.
"Technical" version of runs created

This formula accounts for all basic, easily available offensive statistics:

RC = (H + BB - CS + HBP - GIDP) × (TB + 0.26 × (BB - IBB + HBP) + 0.52 × (SH + SF + SB)) / (AB + BB + HBP + SH + SF)

where H is hits, BB is base on balls, CS is caught stealing, HBP is hit by pitch, GIDP is grounded into double play, TB is total bases, IBB is intentional base on balls, SH is sacrifice hit, SF is sacrifice fly, and AB is at bats.
2002 version of runs created

Earlier versions of runs created overestimated the number of runs created by players with extremely high A and B factors (on-base and slugging), such as Babe Ruth, Ted Williams and Barry Bonds. This is because these formulas placed a player in an offensive context of players equal to himself; it is as if the player is assumed to be on base for himself when he hits home runs. Of course, this is impossible, and in reality, a great player is interacting with offensive players whose contributions are inferior to his. The 2002 version corrects this by placing the player in the context of his real-life team. This 2002 version also takes into account performance in "clutch" situations.
A: B:

C:

where K is strikeout.

The initial individual runs created estimate is then:

If situational hitting information is available, the following should be added to the above total:

where RISP is runners in scoring position, BA is batting average, HR is home run, and ROB is runners on base. The subscripts indicate the required condition for the formula; for example, H with the subscript RISP means "hits while runners are in scoring position." This is then figured for every member of the team, and an estimate of total team runs scored is added up. The actual total of team runs scored is then divided by the estimated total team runs scored, yielding a ratio of real to estimated team runs scored. The above individual runs created estimate is then multiplied by this ratio, to yield a runs created estimate for the individual.[2]
Other expressions of runs created

The same information provided by runs created can be expressed as a rate stat, rather than a raw number of runs contributed. This is usually expressed as runs created per some number of outs, e.g. runs created per 27 outs (RC/27), 27 of course being the number of outs per team in a standard 9-inning baseball game.
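A small sketch of the rate form; the outs approximation used here (AB - H + SH + SF + CS + GIDP) is an assumption, since published versions differ in which out events they include.

```python
def rc_per_27_outs(rc, ab, h, sh=0, sf=0, cs=0, gidp=0):
    """Runs created per 27 outs.

    Outs are approximated as AB - H + SH + SF + CS + GIDP; this is an
    assumption, as different versions count outs slightly differently.
    """
    outs = ab - h + sh + sf + cs + gidp
    return 27 * rc / outs
```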

Accuracy
Runs created is believed to be an accurate measure of an individual's offensive contribution because when used on whole teams, the formula normally closely approximates how many runs the team actually scores. Even the basic version of runs created usually predicts a team's run total within a 5% margin of error.[3] Other, more advanced versions are even more accurate.

Runs produced

Runs produced is a baseball statistic that can help estimate the number of runs a hitter contributes to his team. The formula adds together the player's runs and runs batted in, and then subtracts the player's home runs:[1]

Runs produced = R + RBI - HR

Home runs are subtracted to compensate for the batter getting credit for both one run and at least one RBI when hitting a home run. Unlike runs created, runs produced is a teammate-dependent stat in that it includes runs and RBIs, which are affected by which batters bat near a player in the batting order. Also, subtracting home runs seems logical from an individual perspective, but on a team level it double-counts runs that are not home runs. To counteract the double-counting, some have suggested an alternate formula which is the average of a player's runs scored and runs batted in:

Runs produced = (R + RBI) / 2

Here, when a player scores a run, he shares the credit with the batter who drove him in, so both are credited with half a run produced. The same is true for an RBI, where credit is shared between the batter and runner. In the case of a home run, the batter is responsible for both the run scored and the RBI, so the runs produced are (1 + 1)/2 = 1, as expected.
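Both the classic formula and the averaged alternative described above are one-liners; a minimal sketch:

```python
def runs_produced(r, rbi, hr):
    """Classic runs produced: R + RBI - HR."""
    return r + rbi - hr

def runs_produced_shared(r, rbi):
    """Alternate form that splits credit between scorer and driver-in: (R + RBI) / 2."""
    return (r + rbi) / 2

# A solo home run: one run and one RBI, both credited to the batter
print(runs_produced(1, 1, 1), runs_produced_shared(1, 1))  # 1 1.0
```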

Total average

Total average is a baseball statistic devised by sportswriter Thomas Boswell in the 1970s. The statistic is designed to measure a hitter's overall offensive contributions. The definition of the statistic is simple. A player gets a credit for every base he accumulates and a penalty for every out he makes. So a player gets one credit for a single, walk, stolen base or being hit by a pitch; two for a double; three for a triple; and four for a home run. A player's total average is calculated by adding all the bases together and dividing them by the number of outs the player makes. The formula is:

TA = (TB + HBP + BB + SB) / (AB - H + CS + GIDP)

where

TA = Total average
TB = Total bases
HBP = Hit by pitch
BB = Walks
SB = Stolen base
CS = Caught stealing
AB = At bats
H = Hits
GIDP = Grounded into double play

Because Total average emphasizes walks and extra base hits - and de-emphasizes singles - it has much in common with statistics developed by Bill James and other sabermetricians. Like OPS, total average gives credit to players who draw a lot of walks and hit with a lot of power: Babe Ruth, Barry Bonds, Ted Williams and Frank Thomas for instance. James himself was critical of total average.[citation needed]
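A minimal sketch of the formula as reconstructed above; treat the exact composition of the out terms in the denominator as an assumption.

```python
def total_average(tb, hbp, bb, sb, ab, h, cs, gidp):
    """Total average: bases gained divided by outs made.

    TA = (TB + HBP + BB + SB) / (AB - H + CS + GIDP)
    """
    return (tb + hbp + bb + sb) / (ab - h + cs + gidp)
```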

Extrapolated Runs

Extrapolated Runs (XR) is a baseball statistic invented by sabermetrician Jim Furtado to estimate the number of runs a hitter contributes to his team. XR measures essentially the same thing as Bill James' Runs Created, but it is a linear weights formula that assigns a run value to each event, rather than a multiplicative formula like James' creation.


Purpose and formulae


According to Furtado, Extrapolated Runs was inspired by Paul Johnson's Estimated Runs Produced (ERP) formula, which was published in James' 1985 Baseball Abstract. Furtado found that Johnson's method, when written a different way, was essentially a linear weights formula (something James apparently did not recognize at the time, given his very public disdain for linear run estimators). ERP was almost as accurate as RC at measuring team runs, it did not succumb to RC's infamous problems at the individual level, and its values stacked up well when compared to Pete Palmer's linear weights formula, even though the two methods were developed in entirely different ways. For these reasons, Furtado believed that linear estimators had more promise than was originally thought, and he set out to develop his own. After much trial and error (some of which involved borrowing concepts and weights from other linear formulas), Furtado eventually found a set of weights that best fit his sample (every Major League Baseball season from 1955 to 1997). He unveiled the formula in the 1999 Big Bad Baseball Annual:

"Extrapolated Runs was developed for use with seasons from 1955 to the present. I came up with three versions of the formula. The three formulas are:

XR (Extrapolated Runs) = (.50 x 1B) + (.72 x 2B) + (1.04 x 3B) + (1.44 x HR) + (.34 x (HP + TBB - IBB)) + (.25 x IBB) + (.18 x SB) + (-.32 x CS) + (-.090 x (AB - H - K)) + (-.098 x K) + (-.37 x GIDP) + (.37 x SF) + (.04 x SH)

XRR (Extrapolated Runs Reduced) = (.50 x 1B) + (.72 x 2B) + (1.04 x 3B) + (1.44 x HR) + (.33 x (HP + TBB)) + (.18 x SB) + (-.32 x CS) + (-.098 x (AB - H))

XRB (Extrapolated Runs Basic) = (.50 x 1B) + (.72 x 2B) + (1.04 x 3B) + (1.44 x HR) + (.34 x TBB) + (.18 x SB) + (-.32 x CS) + (-.096 x (AB - H))

"As you can see, calculating XR requires only addition and multiplication. Its simplicity of design is one of its greatest attributes. Unlike a lot of the other methods, you don't need to know team totals, actual runs, league figures or anything else. You just plug the stats into the formula and you are all set. "Another of XR attributes is that the formula is pretty much context neutral. Other than park effects, the only remaining residue of context is due to the inclusion of IBB, GIDP and SF. Although I could have removed them from the full version, I felt that the inclusion of these terms was important since my research showed there was a strong correlation between the IBB, SF and GIDP opportunities that players face. I also felt, like Bill James, that these statistics do tell us something valuable about players. Of course, I knew some people might not agree with me. For them, I created two other versions. "XR also accounts for just about every out. James correctly understands that the more outs an individual player consumes the less valuable his positive contributions are. Since XR will be used as the base of the Extrapolated Win method, I thought it was important to include as many outs as possible in the formula. "Another nice thing about XR is that if you add up all the players' Extrapolated Runs, you'll have the team totals. That's a benefit of using a linear equation."

Pros and cons of extrapolated runs


Along with Palmer's Linear Weights, XR is the most accurate of the linear run estimators, in terms of predicting team runs scored. And unlike James' RC, it doesn't artificially inflate the runs produced by individual players who combine high OBPs and SLGs. It is also much easier to calculate than Base Runs. However, like any linear formula, there is no guarantee that it will work outside of the context in which it was developed (in this case, seasons from 1955 to 1997).

Component ERA


Component ERA or ERC is a baseball statistic invented by Bill James.[citation needed] It attempts to forecast a pitcher's earned run average (ERA) from the number of hits and walks allowed rather than from the standard formula of average number of earned runs per nine innings. ERC allows one to take a fresh look at a pitcher's performance and gauge whether his results are more or less than the sum of their parts. The formula for ERC as it appears in the 2004 edition of the Bill James Handbook:
ERC = (((H + BB + HBP) × PTB) / (BFP × IP)) × 9 - 0.56

where H is hits, BB is bases on balls (walks), HBP is hit by pitch, BFP is batters faced by pitcher, IP is innings pitched, and PTB is defined as:
PTB = 0.89 × (1.255 × (H - HR) + 4 × HR) + 0.56 × (BB + HBP - IBB)

where HR is home runs, IBB is intentional walks, and others are as above. The point of the first component is to represent the number of baserunners allowed. The PTB component combines an estimate of extra bases allowed (first half) with the fact that walks and hit by pitches do not advance unforced baserunners (second half). The division places the computation into an ERA context, and the final subtraction moves that scale down into its normal range. Where intentional walk data are not available use:
PTB = 0.89 × (1.255 × (H - HR) + 4 × HR) + 0.475 × (BB + HBP)

If ERC is less than 2.24, the formula is adjusted as follows:


ERC = (((H + BB + HBP) × PTB) / (BFP × IP)) × 9 × 0.75
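A sketch of the Handbook formula as reconstructed above; the low-ERC adjustment is applied as the text describes (recomputing with a 0.75 multiplier instead of the 0.56 subtraction), which is an interpretation of the wording rather than a verified reference implementation.

```python
def component_era(h, bb, hbp, hr, ibb, bfp, ip):
    """Component ERA (ERC) following the 2004 Bill James Handbook formula above."""
    ptb = 0.89 * (1.255 * (h - hr) + 4 * hr) + 0.56 * (bb + hbp - ibb)
    erc = (h + bb + hbp) * ptb / (bfp * ip) * 9 - 0.56
    if erc < 2.24:
        # Per the text: below 2.24, multiply by 0.75 instead of subtracting 0.56
        erc = (h + bb + hbp) * ptb / (bfp * ip) * 9 * 0.75
    return erc
```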

Other people and organizations have their own proprietary formulae for ERC which may correlate more highly with actual earned runs scored than the formula above.[citation needed] Component ERA was added to the ESPN.com "Sortable Stats" in 2004.

Defense-Independent Component ERA

Abbreviated 'DICE', Defense-Independent Component ERA is a recent (21st century) variation on Component ERA, one of an increasing number of baseball sabermetrics that fall under the umbrella of defense independent pitching statistics. DICE[1] was created by Clay Dreslough in 2001. The formula for Defense-Independent Component ERA (DICE) is:

DICE = 3.00 + (13 * HR + 3 * (BB + HBP) - 2 * K) / IP

In that equation, "HR" is home runs, "BB" is walks, "HBP" is hit batters, "K" is strikeouts, and "IP" is innings pitched. That equation gives a number that is better at predicting a pitcher's ERA in the following year than the pitcher's actual ERA in the current year. Component ERA was created by Bill James to provide a more accurate way of evaluating pitchers than earned run average (ERA). Whereas ERA is significantly affected by luck (such as whether the component hits are allowed consecutively), Component ERA eliminates this factor and assigns a weight to each of the recorded 'components' of a pitcher's performance. For CERA, these are singles, doubles, triples, home runs, walks and hit batters. Defense-Independent Component ERA (aka 'DICE') is an improvement on CERA that removes the contribution of the pitcher's defense, instead estimating a pitcher's ERA from the components of his pitching record that don't involve defense. These are home runs, walks, hit batters and strikeouts.
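As a quick illustration, the same formula in Python (function and argument names are just for the example):

def dice(hr, bb, hbp, k, ip):
    """Defense-Independent Component ERA, per the formula above."""
    return 3.00 + (13 * hr + 3 * (bb + hbp) - 2 * k) / ip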

Equivalent average

Equivalent Average (EqA) is a baseball metric invented by Clay Davenport and intended to express the production of hitters in a context independent of park and league effects.[1] It represents a hitter's productivity using the same scale as batting average. Thus, a hitter with an EqA over .300 is a very good hitter, while a hitter with an EqA of .220 or below is poor. An EqA of .260 is defined as league average. The date EqA was invented cannot readily be documented, but references to it were being offered on the rec.sport.baseball usenet group as early as January 14, 1996.[2]


Definition and rationale


The abbreviations used in the EqA formula are: H = hits, TB = total bases, BB = bases on balls (walks), HBP = hit by pitch, SB = stolen bases, SH = sacrifice hits (typically, sacrifice bunts), SF = sacrifice flies, AB = at bats, CS = caught stealing.

EqA is one of several sabermetric approaches which validated the notion that minor league hitting statistics can be useful measurements of Major League ability. It does this by adjusting a player's raw statistics for park and league effects. For instance, the Pacific Coast League is a minor league known to be a very friendly venue for hitters. Therefore, a hitter in the PCL may have notably depressed raw statistics (a lower batting average, fewer home runs, etc.) if he were hitting in another league at the same level. Additionally, in general the level of competition in the PCL is lower than that in the Majors, so a hitter in the PCL would likely have lesser raw statistics in the Majors. EqA is thus useful to strip certain illusions from the surface of players' raw statistics.

EqA is a derivative of Raw EqA, or REqA. REqA is (H + TB + 1.5*(BB + HBP + SB) + SH + SF) divided by (AB + BB + HBP + SH + SF + CS + SB). REqA in turn is adjusted to account for league difficulty and scale to create EqA. EqA has been used for several years by the authors of the Baseball Prospectus. It is also one of the statistics predicted for each hitter in Baseball Prospectus's annual PECOTA forecasts.

EqA is scaled like a batting average, which is the number of safe hits divided by the number of at bats. However, Davenport's EqA aims to capture not so much hits per at bat but instead "runs produced per at bat".[3] In that sense, EqA is akin to a larger family of run estimators that sabermetricians use.
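A minimal Python sketch of the Raw EqA ratio described above; the league-difficulty and scaling steps that turn REqA into EqA (True Average) are left out because they depend on Davenport's league adjustments.

def raw_eqa(h, tb, bb, hbp, sb, sh, sf, ab, cs):
    """Raw EqA (REqA) exactly as stated in the text."""
    numerator = h + tb + 1.5 * (bb + hbp + sb) + sh + sf
    denominator = ab + bb + hbp + sh + sf + cs + sb
    return numerator / denominator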

Renaming
In 2010 Baseball Prospectus renamed EqA "True Average" (abbreviated TAv).[4] The rationale is that "the new name underscores our ability to get a 'True-r' grasp on the quality of a hitter than the aforementioned traditional or more modern stats do. Quite frankly, we're hopeful that this simple, easy-to-remember name can reach a wider audience."

Adjusted ERA+

Adjusted ERA+, often simply abbreviated to ERA+ or ERA plus, is a pitching statistic in baseball. It adjusts a pitcher's earned run average (ERA) according to the pitcher's ballpark (in case the ballpark favors batters or pitchers) and the ERA of the pitcher's league. Average ERA+ is set to be 100; a score above 100 indicates that the pitcher performed better than average, below 100 indicates worse than average. For instance, if the average ERA in the league is 4.00, and the pitcher is pitching in a ballpark that favors hitters, and his ERA is 4.00, then his ERA+ will be over 100. Likewise, if the average ERA in the league is 3.00, and the pitcher is pitching in a ballpark favoring pitchers, and the pitcher's ERA is 3.00, then the pitcher's ERA+ will be below 100. As a result, ERA+ can be used to compare pitchers across different run environments. In the above example, the first pitcher may have performed better than the second pitcher, even though his ERA is higher. ERA+ can be used to account for this misleading impression.

Pedro Martínez holds the modern record for highest ERA+ in a single season; he posted a 1.74 ERA in the 2000 American League, which had an average ERA of 4.92, which gave Martínez an ERA+ of 291.[1] While Bob Gibson has the lowest ERA in modern times (1.12 in the National League in 1968), the average ERA was 2.99 that year (the so-called Year of the Pitcher) and so Gibson's ERA+ is 258, sixth highest since 1900. 1968 was the last year that Major League Baseball employed the use of a pitcher's mound greater than 10 inches.[2] The career record for ERA+ (with a minimum of 1,000 innings pitched) is held by Mariano Rivera, a closer who has a career ERA+ of 206. The career record ERA+ amongst retired players is 154, held by Pedro Martínez, with Jim Devlin, a pitcher in the 1870s, next at 151.[3] Pedro Martínez has the most separate seasons with an ERA+ over 200, with five, and the most consecutive 200 ERA+ seasons (4). Roger Clemens topped a 200 ERA+ three times, and Greg Maddux had two such seasons.
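A simplified sketch of the idea in Python. The exact park adjustment (and Baseball-Reference's current refinements) are omitted; the park_factor argument here is a placeholder for whatever adjustment a source applies.

def era_plus(era, league_era, park_factor=1.0):
    """Simplified ERA+: park-adjusted league ERA divided by the pitcher's ERA,
    scaled so that 100 is average (park_factor > 1 for hitter-friendly parks)."""
    return 100 * (league_era * park_factor) / era

# Pedro Martinez, 2000: a 1.74 ERA against a 4.92 league ERA is about 283
# before any park adjustment; Fenway's hitter-friendly factor accounts for
# the published figure of 291.
print(round(era_plus(1.74, 4.92)))  # -> 283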

Defense independent pitching statistics



In baseball, defense-independent pitching statistics (DIPS) measure a pitcher's effectiveness based only on statistics that do not involve fielders (except the catcher). These include home runs allowed, strikeouts, hit batters, walks, and, more recently, fly ball percentage, ground ball percentage, and (to a much lesser extent) line drive percentage. By focusing on these statistics, which the pitcher has almost total control over, and ignoring what happens once a ball is put in play, which the pitcher has little control over, DIPS can offer a clearer picture of the pitcher's true ability.

Originally, the most controversial part of DIPS was the idea that pitchers have little influence over what happens to balls that are put into play. But this has since been well established (see below), primarily by showing the large variability of most pitchers' BABIP from year to year. In fact, the outcome of balls in play is dictated largely by the quality and/or arrangement of the defense behind the pitcher, and by a good deal of luck. For example, an outfielder may make an exceptionally strong diving catch to prevent a hit, or a base runner may beat a play to a base on a ball thrown from a fielder with sub-par arm strength.


Origin of DIPS
In 1999, Voros McCracken became the first to detail and publicize these effects to the baseball research community when he wrote on rec.sport.baseball, "I've been working on a pitching evaluation tool and thought I'd post it here to get some feedback. I call it 'Defensive Independent Pitching' and what it does is evaluate a pitcher base[d] strictly on the statistics his defense has no ability to affect. . ." .[1] Until the publication of a more widely read article in 2001, however, on Baseball Prospectus, most of the baseball research community believed that individual pitchers had an inherent ability to prevent hits on balls in play.[2]

McCracken reasoned that if this ability existed, it would be noticeable in a pitcher's 'Batting Average on Balls In Play' (BABIP). His research found the opposite to be true: that while a pitcher's ability to cause strikeouts or allow home runs remained somewhat constant from season to season, his ability to prevent hits on balls in play did not. To better evaluate pitchers in light of his theory, McCracken developed "Defense-Independent ERA" (dERA), the most well-known defense-independent pitching statistic. McCracken's formula for dERA is very complicated, with a number of steps.[3]

DIPS ERA is not as useful for knuckleballers and other "trick" pitchers, a factor that McCracken mentioned a few days after his original announcement of his research findings in 1999, in a posting on the rec.sport.baseball.analysis Usenet site on November 23, 1999, when he wrote: "Also to [note] is that, anecdotally, I believe pitchers with trick deliveries (e.g. Knuckleballers) might post consistently lower $H numbers than other pitchers. I looked at Tim Wakefield's career and that seems to bear out slightly".[4] In later postings on the rec.sport.baseball site during 1999 and 2000 (prior to the publication of his widely read article on BaseballProspectus.com in 2001), McCracken also discussed other pitcher characteristics that might influence BABIP.[5] In 2002 McCracken created and published version 2.0 of dERA, which incorporates the ability of knuckleballers and other types of pitchers to affect the number of hits allowed on balls hit in the field of play (BHFP).[6][7]

Controversy and acceptance

Controversy over DIPS was heightened when Tom Tippett at Diamond Mind published his own findings in 2003. Tippett concluded that the differences between pitchers in preventing hits on balls in play were at least partially the result of the pitcher's skill.[8] Tippett analyzed certain groups of pitchers that appear to be able to reduce the number of hits allowed on balls hit into the field of play (BHFP). Like McCracken, Tippett found that pitchers' BABIP was more volatile on an annual basis than the rates at which they gave up home runs or walks. It was this greater volatility that had led McCracken to conclude pitchers had "little or no control" over hits on balls in play. But Tippett also found large and significant differences between pitchers' career BABIP. In many cases, it was these differences that accounted for the pitchers' relative success. However, improvements to DIPS that look at more nuanced defense-independent stats than strikeouts, home runs, and walks (such as groundball rate) have been able to account for many of the BABIP differences that Tippett identified without reintroducing the noise from defense variability.[9]

Despite other criticisms, the work by McCracken on DIPS is regarded by many in the sabermetric community as the most important piece of baseball research in many years. As Jonah Keri wrote in 2012, "When Voros McCracken wrote his seminal piece on pitching and defense 11 years ago, he helped change the way people (fans, writers, even general managers) think about run prevention in baseball. Where once we used to throw most of the blame for a hit on the pitcher who gave it up, McCracken helped us realize that a slew of other factors go into whether a ball hit into play falls for a hit. For many people in the game and others who simply watch it, our ability to recognize the influence of defense, park effects, and dumb luck can be traced back to that one little article".[10] DIPS ERA was added to ESPN.com's Sortable Stats in 2004.[11]

Alternate formulae
Each of the following formulas uses innings pitched (IP), a measure of the number of outs a team made while a pitcher was in the game.[12] Since most outs rely on fielding, the results from calculations using innings pitched are not truly independent of team defense. While the creators of DICE, FIP and similar statistics all suggest they are "defense independent", others have pointed out that because fielders are typically involved in more than two thirds of the outs that make up innings pitched, these formulas remain heavily dependent on the defensive play of a pitcher's fielders.[13]
DICE

A simple formula, known as Defense-Independent Component ERA (DICE),[14] was created by Clay Dreslough in 1998:

DICE = 3.00 + (13 * HR + 3 * (BB + HBP) - 2 * K) / IP

In that equation, "HR" is home runs, "BB" is walks, "HBP" is hit batters, "K" is strikeouts, and "IP" is innings pitched. That equation gives a number that is better at predicting a pitcher's ERA in the following year than the pitcher's actual ERA in the current year[citation needed].
FIP

Tom Tango independently derived a similar formula, known as Fielding Independent Pitching,[15] which is very close to the results of dERA and DICE:

FIP = (13 * HR + 3 * BB - 2 * K) / IP

In that equation, "HR" is home runs, "BB" is walks, "K" is strikeouts, and "IP" is innings pitched. That equation usually gives you a number that is nothing close to a normal ERA, so the equation used is more often (but not always) this one:

FIP = (13 * HR + 3 * BB - 2 * K) / IP + C

where C is a constant (roughly 3.1) chosen so that league-average FIP matches league-average ERA. That equation gives a number that is much closer to a pitcher's potential ERA. The Hardball Times, a popular baseball statistics website, uses a slightly different FIP equation, instead using 3*(BB+HBP-IBB) rather than simply 3*(BB), where "HBP" stands for batters hit by pitch and "IBB" stands for intentional bases on balls.[16]
xFIP

Dave Studeman of The Hardball Times derived Expected Fielding Independent Pitching (xFIP), a regressed version of FIP. Calculated like FIP, it replaces a pitcher's actual home run total with an expected home run total (xHR), where xHR is calculated using the league average of 10.6% HR/FB (home runs per fly ball).
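A hedged Python sketch of the FIP and xFIP calculations just described. The constant (3.10 here) is an assumption: in practice it is recomputed each season so that league-average FIP equals league-average ERA, and some sources fold HBP and IBB into the walk term as noted above.

def fip(hr, bb, k, ip, constant=3.10):
    """Basic Fielding Independent Pitching."""
    return (13 * hr + 3 * bb - 2 * k) / ip + constant

def xfip(fly_balls, bb, k, ip, lg_hr_per_fb=0.106, constant=3.10):
    """Expected FIP: swap actual home runs for fly balls times the
    league-average HR/FB rate (10.6% per the text)."""
    expected_hr = fly_balls * lg_hr_per_fb
    return (13 * expected_hr + 3 * bb - 2 * k) / ip + constant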

Hits per nine innings



In baseball statistics, hits per nine innings (denoted by H/9) is the average number of hits allowed by a pitcher in a nine inning period; calculated as: (hits allowed x 9) / innings pitched. This is a measure of a pitcher's success based on the number of all outs he records. Compared to a pitcher's batting average against, a pitcher's H/9 benefits from sacrifice bunts, double plays, runners caught stealing, and outfield assists, but it is hurt by some errors. Unlike batting average against, a pitcher's H/9 benefits from outs that are not related to official at bats, as they are recorded on runners after they have reached base.

Strikeouts per 9 innings pitched



In baseball statistics, strikeouts per 9 innings pitched (K/9, SO/9, or SO/9IP) is the mean of strikeouts (or K's) by a pitcher per nine innings pitched. It is determined by multiplying the number of strikeouts by nine and dividing by the number of innings pitched. To qualify, a pitcher must have pitched 1000 innings, which generally limits the list to starters. A separate list is maintained for relievers with 300 innings pitched or 200 appearances. The all-time leader in this statistic is retired pitcher Randy Johnson (10.61). The only two other players who have averaged over 10 are Kerry Wood (10.32) and Pedro Martínez (10.04).[1] Since Wood's retirement in May of 2012, Tim Lincecum (9.7636) took over as the active leader.[2] Among qualifying relievers, Rob Dibble (12.17) is the all-time leader in strikeouts per nine innings.[3][4][5] Active leader David Robertson (12.03)[6] is the only other qualifying reliever averaging more than 12. One effect of K/9 is that it may reward or "inflate" the numbers for pitchers with high batting averages on balls in play (BABIP). Two pitchers who strike out identical percentages of hitters, but have varying BABIPs, will have different K/9 rates since one pitcher will pitch fewer innings to face the same number of batters. For example, Mariano Rivera has a career K/9 rate of 8.32, below the 9.04 of Norm Charlton, despite the two pitchers striking out the same percentage of hitters they faced.[7]

Strikeout-to-walk ratio

In baseball statistics, strikeout-to-walk ratio (K/BB) is a measure of a pitcher's ability to control pitches; calculated as: strikeouts divided by bases on balls. A pitcher that possesses a great K/BB ratio is usually a dominant power pitcher, such as Randy Johnson, Pedro Martínez, Curt Schilling, or Ben Sheets. However, in 2005, Minnesota Twins starting pitcher Carlos Silva easily led the major leagues in K/BB ratio with 7.89:1, despite only striking out 71 batters over 188 innings pitched; he walked only nine batters. A hit by pitch is not counted statistically as a walk and therefore not counted in the strikeout-to-walk ratio. At youth levels where hit by pitches are more common, including hit by pitches may be a more useful statistic. Walks plus hits per inning pitched can also be used to compare pitchers.
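The per-nine-inning rates and the strikeout-to-walk ratio described in the last few sections reduce to one-line calculations; a small Python sketch (helper names are illustrative):

def per_nine(events, innings_pitched):
    """Scale a counting stat (hits, strikeouts, walks, ...) to a 9-inning rate,
    e.g. H/9 or K/9."""
    return 9 * events / innings_pitched

def k_per_bb(strikeouts, walks):
    """Strikeout-to-walk ratio."""
    return strikeouts / walks

# Using the Carlos Silva 2005 figures quoted above (71 K, 9 BB, 188 IP):
print(round(k_per_bb(71, 9), 2))     # -> 7.89
print(round(per_nine(71, 188), 2))   # -> 3.4 strikeouts per nine innings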

Batting average against



In baseball statistics, batting average against (denoted by BAA or AVG), or opponents' batting average (denoted by OBA) is a statistic that measures a pitcher's ability to prevent hits during official at bats. It can alternatively be described as the league's hitters' combined batting average against the pitcher. It is calculated as: Hits Allowed divided by (Batters Faced minus Walks minus Hit Batsmen minus Sacrifice Hits minus Sacrifice Flies minus Catcher's Interference).[1]

It is calculated as:

BAA = H / (BF - BB - HBP - SH - SF - CINT)

for which:

H is the number of hits allowed
BF is the number of batters faced by the pitcher
BB is the number of bases on balls
HBP is the number of hit batsmen
SH is the number of sacrifice hits
SF is the number of sacrifice flies
CINT is the number of catcher's interferences
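In Python, the calculation described above reads as follows (argument names follow the variable list):

def batting_average_against(h, bf, bb, hbp, sh, sf, cint):
    """Opponents' batting average: hits allowed divided by the opposing
    at bats reconstructed from batters faced."""
    at_bats_against = bf - bb - hbp - sh - sf - cint
    return h / at_bats_against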

NERD (sabermetrics)

In baseball statistics, NERD (a wink toward the mnemonic "Narration, Exposition, Reflection, Description") is a quantitative measure of expected aesthetic value. NERD was originally created by Carson Cistulli[1] and is part of his project of exploring the "art" of sabermetric research.[2] The original NERD formula only took into account the pitcher's expected performance[1] while the current model factors in the entire team's performance.[3][4]


History
The premise for NERD was developed in Cistulli's piece "Why We Watch" in which he establishes the five reasons that baseball continues to captivate the American imagination from game to game: "Pitching Matchups," "Statistically Notable (or Otherwise Compelling) Players," "Rookies (and Debuts)," "Seasonal Context," and "Quality of Broadcast".[5] Fellow sabermetrician Rob Neyer, who had collaborated with Cistulli on this piece,[6][7] wrote "the only thing missing [...] is a points system that would let us put a number on each game"[6] and on June 2, 2010 Cistulli unveiled the NERD Pitching formula.[1]

NERD pitching
NERD pitching tries to determine which pitchers will be the most aesthetically appealing to watch for a baseball fan and is both a historical and a predictive statistic.[8] The NERD pitching formula uses a player's standard deviations from the mean (a weighted z-score[9]) of the DIPS statistic xFIP, swinging strike percentage, overall strike percentage, and the differential between the pitcher's ERA and xFIP to determine a quantitative value for each pitcher.[1][10]

The factor of 4.69 is added to make the number fit on a 0 to 10 scale. While there has been some disagreement on the calculation of Cistulli's luck component,[11] the general consensus among sports writers seems to be that a player with a below-average ERA and an above-average xFIP has been "unlucky".[12][13][14]
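Cistulli's exact coefficients are not reproduced in this text, so the following Python sketch only illustrates the general shape of the calculation (weighted z-scores of the listed components plus a constant to land on a 0-to-10 scale); every weight below is a placeholder, not the published formula.

from statistics import mean, stdev

def z(value, population):
    """Standard score of one pitcher's figure against a league-wide sample."""
    return (value - mean(population)) / stdev(population)

def nerd_pitching_sketch(p, league, weights=(2.0, 0.5, 0.5, 1.0), offset=4.69):
    """Illustrative only: combine weighted z-scores of xFIP, swinging-strike
    rate, strike rate, and an ERA-xFIP 'luck' term, then add a constant so
    results fall roughly on a 0-10 scale."""
    w_xfip, w_swstr, w_strike, w_luck = weights
    score = (-w_xfip * z(p["xfip"], league["xfip"])      # lower xFIP is better
             + w_swstr * z(p["swstr"], league["swstr"])
             + w_strike * z(p["strike"], league["strike"])
             + w_luck * (p["era"] - p["xfip"])           # ERA-xFIP differential
             + offset)
    return max(0.0, min(10.0, score))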

Team NERD
Following the model of his Pitching NERD, Team NERD tries to give a quantitative value to the aesthetic value of each of the 30 baseball teams. Its factors are "Age," "Park-Adjusted wRAA," "Park-Adjusted Home Run per Fly Ball (HR/FB)," "Team Speed," "Bullpen Strength," "Team Defense," "Luck" (Base Runs minus Actual Runs Scored), and "Payroll".[3]

In a recent interview, Cistulli admitted that there is a disconnect between the Rays' high tNERD rating and their low attendance, saying that he has considered adding a "park adjustment" to his formula which would reflect either the stadium itself or "attendance relative to the stadium's capacity",[15] but overall reception of this statistic has been positive[16][17] and Fangraphs started reporting Team NERD in Cistulli's One Night Only columns beginning August 23, 2010.[18]

Walks plus hits per inning pitched



In baseball statistics, walks plus hits per inning pitched (WHIP) is a sabermetric measurement of the number of baserunners a pitcher has allowed per inning pitched. It is a measure of a pitcher's ability to prevent batters from reaching base. The stat was invented in 1979 by Daniel Okrent.

While earned run average (ERA) measures the runs a pitcher gives up, WHIP more directly measures a pitcher's effectiveness against the batters faced. It is calculated by adding the number of walks and hits allowed and dividing this sum by the number of innings pitched; therefore, the lower a pitcher's WHIP, the better his performance. One key distinction between WHIP and ERA is that the former will continue to rise as long as batters reach base. If an error is committed with two outs in an inning, any runs scored beyond that point in the same inning will be considered unearned and will not cause that pitcher's ERA to rise.

A WHIP of 1.00 or lower over the course of a season will often rank among the league leaders in Major League Baseball (MLB). WHIP is one of the few sabermetric statistics to enter mainstream baseball usage. (On-base plus slugging, or OPS, a comparable measurement of the ability of a hitter, is another example.) It is one of the most commonly used statistics in fantasy baseball, and is standard in fantasy leagues that employ 4x4, 5x5, and 6x6 formats.

The lowest single-season WHIP in MLB history is 0.7373, recorded by Pedro Martínez with the Boston Red Sox in 2000, which broke the previous record of 0.77 by Guy Hecker of the Louisville Eclipse. Cleveland Indians right-hander Addie Joss currently holds the MLB record for the lowest career WHIP, with a 0.9678 WHIP in 2,327 innings. Chicago White Sox spitballer Ed Walsh ranks second, with a 0.9996 WHIP in 2,964 innings, the lowest career WHIP for a qualified pitcher with 10 or more seasons pitched. Reliever Mariano Rivera ranks third among qualified pitchers with a career WHIP of 1.004 in 1,245 innings, the lowest mark by any pitcher from the live-ball era. Providence Grays and New York Gothams right-hander Monte Ward is fourth all time with a career WHIP of 1.0440, followed by Pedro Martínez, whose 1.0544 career WHIP is the lowest of any starting pitcher from the live-ball era.
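The calculation itself is a one-liner; a Python sketch with an arithmetic example:

def whip(walks, hits, innings_pitched):
    """Walks plus hits per inning pitched; lower is better."""
    return (walks + hits) / innings_pitched

# 160 baserunners (hits plus walks) allowed over 217 innings works out to
# a WHIP of roughly 0.737.
print(round(whip(32, 128, 217), 4))  # -> 0.7373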

Ultimate zone rating



Ultimate zone rating (UZR) is a sabermetric statistic used to measure fielding. It compares the event that actually happened (hit/out/error) to data on similarly hit balls in the past to determine how much better or worse the fielder did than the "average" player. UZR divides a baseball field into multiple zones and assigns individual fielders responsibility for those zones.[1] UZR calculations are provided at Fangraphs by Mitchel Lichtman. Proponents of the statistic advise that defense is best judged over three-year spans, as a given year contains a relatively small sample and can result in large statistical swings.[2] Major League Baseball shortstop David Eckstein says "a lot of defense is putting yourself in the right position to make plays." Josh Stein, San Diego Padres director of baseball operations, said UZR "can be skewed if the player is not starting from the exact middle of [UZR's zone] chart."[1]

Value over replacement player



In baseball, value over replacement player (or VORP) is a statistic popularized by Keith Woolner that demonstrates how much a hitter contributes offensively or how much a pitcher contributes to his team in comparison to a fictitious "replacement player," who is an average fielder at his position and a below average hitter.[1][2] A replacement player performs at "replacement level," which is the level of performance an average team can expect when trying to replace a player at minimal cost, also known as "freely available talent."

VORP's usefulness is in the fact that it measures contribution at the margin (as in marginal utility). Other statistics compare players to the league average, which is good for cross-era analysis (example: 90 runs created in 1915 are much better than 90 RC in 1996, because runs were more scarce in 1915). However, league-average comparisons break down when considering a player's total, composite contribution to a team.

Baseball is a zero-sum game; in other words, one team can only win if another loses. A team wins by scoring more runs than its opponent. It follows, then, that a contribution of any runs helps a team toward a win, no matter how small the contribution. However, the Major Leagues are highly competitive, and talent distribution in baseball does not resemble the traditional "bell curve" of a normal distribution; rather, the majority of players fall within the category of "below-average" or worse. (Since only the most talented baseball players make the Major Leagues, if all Americans' baseball talent was distributed on a bell curve then the Major Leagues would only see the uppermost edge of it, resulting in a "right-skewed" distribution.) Therefore, the so-called "average player" does not have a value of zero, like in Pete Palmer's Total Player Rating,[citation needed] but instead is a valued commodity. One alternative is to rank players using "counting stats" (simply their gross totals), but this is unacceptable as well, since it is likely that the contribution a marginal player makes, even if it does help a team win one game, is not enough to justify his presence in the Majors. This is where the concept of the replacement level enters the picture.

VORP is a cumulative stat or counting stat, not a projected stat. For example, if Bob Jones has a VORP of +25 runs after 81 games, he has contributed 25 more runs of offense to his team than the theoretical replacement player would have, over 81 games. As Bob Jones continues to play the rest of the season, his VORP will increase or decrease, depending upon his performance, and settle at a final figure, e.g., +50 runs, at the end of the season.


VORP for Hitters


The currency of baseball is the out. There is a finite number of outs that a team can make in one game, and it is almost always 27 (or 3 outs/inning * 9 innings/game). A player consumes these outs to create runs, and at the simplest level, runs and outs are the only truly meaningful stats in baseball. Outs are calculated by simply taking at-bats and subtracting hits, then adding in various outs that don't count toward at-bats: sacrifice hits, sacrifice flies, caught stealing, and grounded into double-play. Runs may be estimated by one of many run-approximation methods: Bill James' runs created, Pete Palmer's linear weights,[citation needed] BaseRuns, etc. Baseball Prospectus author Keith Woolner uses Clay Davenport's Equivalent Runs in the calculation of VORP. Armed with runs and outs (for the player and that player's league), one can finally calculate VORP.

Critics of VORP take issue with where the formula's arbitrary "replacement level" is set.[citation needed] Many equations and methods exist for finding the replacement level, but most will set the level somewhere around 80% of the league average, in terms of runs per out.[citation needed] There are two exceptions to this, though: catchers, who shoulder a larger defensive responsibility than any other player in the lineup (and are therefore more scarce), have a replacement level at 75% of the league average. At the other end of the defensive spectrum, first basemen and designated hitters must produce at a level above 85% of the average to be considered better than "replacement level," since defense is not a big consideration at either position (it is not a consideration at all for the DH).

Therefore, to calculate VORP one must multiply the league's average runs per out by the player's total outs; this provides the number of runs an average player would have produced given that certain number of outs to work with. Now multiply that number (of runs) by .8, or whatever percentage of average the replacement level is designated to be; the result is the number of runs you could expect a "replacement player" to put up with that number of outs. Simply subtract the replacement's runs created from the player's actual runs created, and the result is VORP.

This is not the final adjustment, however: while the replacement's run total will be park-neutral (by definition, because replacement numbers are derived from league averages), the player's raw numbers won't be. Before calculating the VORP, the individual player stats must be normalized via park factors to eliminate the distortions that can be created by each ballpark, especially extreme parks like Coors Field in Denver (where the thin high-altitude air allows baseballs to travel farther than at sea level, although the humidor has significantly decreased the runs scored in Coors Field, to the extent that Denver is no longer considered a pure hitter's haven)[3] and Petco Park in San Diego (where the heavier sea air couples with distant fences to suppress run-scoring).[4] After the final adjustment, the resultant VORP may be used to estimate how "valuable" the player in question is by providing a good picture of that player's marginal utility.
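A condensed Python sketch of the hitter calculation just described. The run estimator, park adjustment, and league figures are inputs here; the 0.80 replacement fraction is the rough default from the text (about 0.75 for catchers, about 0.85 for first basemen and designated hitters).

def hitter_vorp(player_runs, player_outs, league_runs_per_out,
                replacement_fraction=0.80):
    """Runs contributed above what a replacement-level hitter would produce
    while consuming the same number of outs."""
    average_runs = league_runs_per_out * player_outs
    replacement_runs = average_runs * replacement_fraction
    return player_runs - replacement_runs

# A hitter creating 60 (park-adjusted) runs while making 300 outs, in a
# league scoring 0.17 runs per out: 60 - (0.17 * 300 * 0.8) = 19.2 VORP.
print(hitter_vorp(60, 300, 0.17))  # -> approximately 19.2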

VORP for Pitchers


VORP can also be calculated for pitchers, as a measurement of the number of runs he has prevented from scoring that a replacement-level pitcher would have allowed. The concept is essentially the same as it was for hitters: using the player's playing time (in a pitcher's case, his innings pitched), determine how many runs a theoretical "replacement" would have given up in that playing time (at the most basic level, the replacement level is equal to 1 plus the league's average runs per game), and subtract from that number the amount actually allowed by the pitcher to arrive at VORP. As an aside, Run Average is used as a measure of pitcher quality rather than Earned Run Average.[citation needed] ERA is heavily dependent on the concept of the error, which most sabermetricians have tried to shy away from because it is a scorer's opinion; also, we are trying to determine VORP in units of runs, so a calculation that uses earned runs is not of very much use to us in this instance. The "old" definition of pitching VORP, as alluded to above, was simply:[citation needed]
VORP = (((League Runs/Game + 1) - RAvg)/9)*Innings Pitched

However, further research[which?] indicated that starting pitchers and relief pitchers have different replacement thresholds, as it is easier to put up a low RAvg in relief than as a starter.[citation needed] Armed with that knowledge, Baseball Prospectus 2002 published the current formula for determining the replacement level for pitchers:
For starting pitchers, Repl. Level = 1.37 * League RA - 0.66
For relief pitchers, Repl. Level = 1.70 * League RA - 2.27

Therefore, the current formula for VORP is:[citation needed]


VORP = ((Repl. Level - RAvg)/9)*Innings Pitched

As was the case with hitters, run average should be normalized for park effects before VORP is calculated. Pitcher VORP is on the same scale as that of hitters.
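The pitching version in Python, using the replacement-level formulas quoted above (the example league run average is made up):

def pitcher_replacement_level(league_ra, starter=True):
    """Replacement-level run average per the Baseball Prospectus 2002 formulas."""
    if starter:
        return 1.37 * league_ra - 0.66
    return 1.70 * league_ra - 2.27

def pitcher_vorp(run_average, innings_pitched, league_ra, starter=True):
    """Runs prevented relative to a replacement-level pitcher."""
    repl = pitcher_replacement_level(league_ra, starter)
    return ((repl - run_average) / 9) * innings_pitched

# A starter with a park-adjusted 3.50 run average over 200 innings in a
# league averaging 4.50 runs per game:
print(round(pitcher_vorp(3.50, 200, 4.50), 1))  # -> roughly 44.6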

Win shares

Win shares is the name of the metric Bill James describes in his 2002 book Win Shares. It considers statistics for baseball and basketball players, in the context of their team and in a sabermetric way, and assigns a single number to each player for his contributions for the year. A win share represents one-third of a team win, by definition.[1] If a team wins 80 games in a season, then its players will share 240 win shares. The formula for calculating win shares is complicated; it takes up pages 16-100 in the book.

The general approach is to take the team's win shares (i.e., 3 times its number of wins), then divide them between offense and defense. In baseball, all pitching, hitting and defensive contributions by the player are taken into account. Statistics are adjusted for park, league and era. On a team with equal offensive and defensive prowess, hitters receive 48% of the win shares and those win shares are allocated among the hitters based on runs created. An estimation is then made to decide what amount of the defensive credit goes to pitchers and what amount goes to fielders. Pitching contributions typically receive 35% (or 36%) of the win shares, defensive contributions receive 17% (or 16%) of the win shares. The pitching contributions are allocated among the pitchers based on runs prevented, the pitchers' analogue to runs created. Fielding contributions are allocated among the fielders based on a number of assumptions and a selection of traditional defensive statistics.[2]
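The full method is far too involved to reproduce here, but the top-level split described above can be illustrated in a few lines of Python; the fixed percentages below are the rough defaults mentioned in the text, not the team-specific values the real system derives.

def team_win_shares_split(team_wins, hitting_share=0.48,
                          pitching_share=0.35, fielding_share=0.17):
    """Rough illustration of how a team's win shares are divided."""
    total = 3 * team_wins  # one win share = one-third of a team win
    return {
        "hitting": total * hitting_share,
        "pitching": total * pitching_share,
        "fielding": total * fielding_share,
    }

# An 80-win team has 240 win shares to distribute: about 115.2 hitting,
# 84 pitching, and 40.8 fielding under these default shares.
print(team_win_shares_split(80))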

In Major League Baseball, based on a 162-game schedule, a typical All-Star might amass 20 win shares in a season. More than 30 win shares (i.e. the player is directly responsible for 10 wins by his team) is indicative of MVP-level performance, and 40+ win shares represents an exceptional, historic season. For pitchers, Win Shares levels are typically lower; in fact, they often come close to mirroring actual wins. Win shares differs from other sabermetric player rating metrics such as Total player rating and VORP in that it is based on total team wins, not runs above replacement.


Criticism of win shares

Players cannot be awarded "loss shares", or negative win shares, by definition. Some critics of the system argue that negative win shares are necessary. In defense of the system, proponents argue that very few players in a season would amass a negative total, if it were possible. However, critics argue, when one player does amass a negative total, he is zeroed out, thus diminishing other players' win-share totals. In an attempt to fix this error, some have developed a modified system in which negative win shares are indeed possible.

The allocation of win shares, 48% to offense and 52% to defense, is justified by James in that pitchers typically receive less credit than hitters in win shares and would receive far too few win shares if they were divided evenly.

One criticism of this metric is that players who play for teams that win more games than expected, based on the Pythagorean expectation, receive more win shares than players whose team wins fewer games than expected. Since a team exceeding or falling short of its Pythagorean expectation is generally acknowledged as chance, some believe[who?] that credit should not be assigned purely based on team wins. However, team wins are the bedrock of the system, whose purpose is to assign credit for what happened. Win shares are intended to represent player value (what they were responsible for) rather than player ability (what the player's true skill level is).

Within the sabermetric community there is ongoing debate as to the details of the system. The Hardball Times has developed its own Win Shares, as well as a number of derivative statistics, such as Win Shares Above Bench, Win Shares Percentage, Win Shares Above Average, and All Star Win Shares.

Wins Above Replacement



Wins Above Replacement or Wins Above Replacement Player, commonly known as WAR or WARP, is a non-standardized sabermetric baseball statistic developed to determine the value of "a player's total contributions to their team",[1] derived from baserunning, batting, fielding, and pitching.[2] It is claimed to show the number of additional wins a player would contribute to a team compared to a replacement level player at that position, usually a minor league player or bench player.[2] The purpose of the WAR framework is to determine how much better a player is compared to a readily available substitute with minimum marginal acquisition costs.[3][4] A team of replacement-level players would be expected to win a baseline minimum number of games, typically 40-50, per 162-game season.[2] WAR does not reflect a true talent level, but rather is a descriptor of the value of contributions made by a player.[5]


Calculation
There is no clearly established formula for WAR. Sources that provide the statistic calculate it differently. These include Baseball Prospectus, Baseball Reference, and Fangraphs. All of these sources publish the method they use to calculate WAR, and all use similar basic principles to do so. The version published by Baseball Prospectus is named WARP,[6] that by Baseball Reference is named rWAR ("r" derives from "Rally" or "RallyMonkey", a nickname for Sean Smith, who created the statistic) or bWAR,[7] and that for Fangraphs is named fWAR.[8] Compared to rWAR, the calculation of fWAR places greater emphasis on peripheral statistics.[2]

WAR values are scaled equally for pitchers and batters; that is, pitchers and position players will have roughly the same WAR if their contribution to their team is deemed similar. However, the values are calculated differently for pitchers and position players: position players are evaluated using statistics for fielding and hitting, while pitchers are evaluated using statistics related to the opposing batters' hits, walks and strikeouts in Fangraphs' version and runs allowed per 9 innings with a team defense adjustment for Baseball Reference's version. Because the independent WAR frameworks are calculated differently, they do not have the same scale[9] and cannot be used interchangeably in an analytical context.

Position players

Baseball Reference

Baseball Reference uses six components to calculate WAR for position players:[10] The components are batting runs, baserunning runs, runs added or lost due to grounding into double plays in double play situations, fielding runs, positional adjustment runs, and replacement level runs (based on playing time). The first five factors are compared to league average, so a value of 0 represents an average player.

One term of the WAR formula is calculated from the first five factors, and the other term from the remaining (replacement level) factor.[10] Batting runs depends on weighted Runs Above Average (wRAA), weighted to the offense of the league, and is calculated from wOBA.[11]

Here, "AB" is the number of at bats, "BB" the number of bases on balls ("uBB" is unintentional bases on balls and "IBB" is intentional bases on balls), "HBP" the number of times hit by pitch, "SF" the number of sacrifice flies, "SH" the number of sacrifice hits, "1B" the number of singles, "2B" the number of doubles, "3B" the number of triples, "HR" the number of home runs, "SB" the number of stolen bases, and "CS" the number of caught stealing; the remaining terms in the wOBA formula are weighting coefficients.[11] Baseball Reference eliminates pitcher batting results from its data, computes linear weights and wOBA coefficients for each league, then scales the values for each league and season.[11]

The positional adjustment is a value dependent on the player's position: +10.0 for a catcher, -10.0 for a first baseman, +3.0 for a second baseman, +2.0 for a third baseman, +7.5 for a shortstop, -7.5 for a left fielder, +2.5 for a center fielder, -7.5 for a right fielder, and -15.0 for a designated hitter.[11] These values are set assuming 1,350 innings played (150 games of 9 innings).[11] A player's positional adjustment is the sum of the positional adjustment for each position played by the player scaled to the number of games played by the player at that position, normalized to 1,350 innings.[11]
Fangraphs

The Fangraphs formula for position players involves offense, defense, and base running.[12] These are measured using weighted Runs Above Average, Ultimate zone rating (UZR), and Ultimate base running (UBR), respectively.[12] These values are adjusted using park factors, and a positional adjustment is applied, resulting in a player's "value added above league average". To this is added a scaled value to reflect the player's value compared to a replacement-level player, which is assumed to be 20 runs below average per 600 plate appearances. All four values are measured in runs.

The positional adjustment is a value dependent on the player's position: +12.5 for a catcher, -12.5 for a first baseman, +2.5 for a second or third baseman, +7.5 for a shortstop, -7.5 for a left fielder, +2.5 for a center fielder, -7.5 for a right fielder, and -17.5 for a designated hitter.[13] These values are scaled to the number of games played by the player at each position.[13]
Pitchers

Baseball Reference, at the most basic level, uses two components to calculate WAR for pitchers: Runs Allowed (both earned and unearned) and Innings Pitched.[14]

Analysis
In 2009, Dave Cameron stated that fWAR does an "impressive job of projecting wins and losses".[15] He found that a team's projected record based on fWAR and that team's actual record has a strong correlation (correlation coefficient of 0.83), and that every team was within two standard deviations (σ = 6.4 wins).[15]

In 2012, Glenn DuPaul conducted a regression analysis comparing the cumulative rWAR of five randomly selected teams per season (from 1996 to 2011) against those teams' realized win totals for those seasons. He found that the two were highly correlated, with a correlation coefficient of 0.91, and that 83% of the variance in wins was explained by fWAR (R² = 0.83).[5] The standard deviation was 2.91 wins. The regression equation he obtained was close to the expected equation (roughly, wins = 52 + team WAR), in which a team of replacement-level players is expected to have a .320 winning percentage, or 52 wins in a 162-game season.

To test fWAR as a predictive tool, DuPaul executed a regression between a team's cumulative player WAR from the previous year and the team's realized wins for that year. The resultant regression equation was:[5]

which has a statistically significant correlation of 0.59, meaning that 35% of the variance in team wins could be accounted for by the cumulative fWAR of its players from the previous season.[5]

Use
ESPN publishes the Baseball Reference version of WAR on its statistics pages for position players and pitchers.[2] Bill James states that there is a bias favouring players from earlier eras because there was greater variance in skills at the time, so "the best players were further from the average than they are now".[2] That is, in modern baseball, it is more difficult for a player to exceed the abilities of their peers than it was in the 1800s and the dead-ball and live-ball eras of the 1900s.[2]

Nearing the end of the 2012 Major League Baseball season and afterward, there was much debate about which player should win the Major League Baseball Most Valuable Player Award for the American League.[16] The two candidates considered by most writers were Miguel Cabrera, who won the Triple Crown, and Mike Trout, a rookie who led Major League Baseball in WAR.[17] The debate focused on the use of traditional baseball statistics, such as RBIs and home runs, and sabermetric statistics such as WAR.[16] Cabrera led the American League in batting average, home runs, and RBIs, but Trout was considered a more complete player;[18] whereas Cabrera was just below league average defensively (-0.2 defensive WAR), Trout was third best among center fielders (2.2 defensive WAR).[19] Cabrera would win the Award, with 22 of 28 first-place votes from the Baseball Writers Association of America.[20]

Some sabermetricians "have been distancing themselves from the importance of single-season WAR values"[5] because some of the defensive metrics incorporated into WAR calculations have significant variability. During the 2012 season, the Toronto Blue Jays employed an infield shift against some left-handed batters, such as David Ortiz or Carlos Peña, in which third baseman Brett Lawrie would be assigned to shallow right field. This resulted in a very high Defensive Runs Saved (DRS) total for Lawrie,[21] and hence a high rWAR, which uses DRS as a component.[22] Ben Jedlovec, an analyst for DRS creator Baseball Info Solutions, said that Lawrie was "making plays in places where very few third basemen are making those plays" because of the "very optimal positioning by the Blue Jays".[23] Another fielding metric, Ultimate zone rating (UZR), uses the DRS data but excludes runs saved as a result of a shift.[23]

Pythagorean expectation

Pythagorean expectation is a formula invented by Bill James to estimate how many games a baseball team "should" have won based on the number of runs they scored and allowed. Comparing a team's actual and Pythagorean winning percentage can be used to evaluate how lucky that team was (by examining the variation between the two winning percentages). The name comes from the formula's resemblance to the Pythagorean theorem.[1] The basic formula is:

Win = (runs scored)^2 / ((runs scored)^2 + (runs allowed)^2)

where Win is the winning ratio generated by the formula. The expected number of wins would be the expected winning ratio multiplied by the number of games played.


Empirical origin
Empirically, this formula correlates fairly well with how baseball teams actually perform. However, statisticians since the invention of this formula have found it to have a fairly routine error, generally about three games off. For example, in 2002, the New York Yankees scored 897 runs and allowed 697 runs. According to James' original formula, the Yankees should have won 62.35% of their games:

Win = 897^2 / (897^2 + 697^2) = .6235

Based on a 162-game season, the Yankees should have won about 101 games. The 2002 Yankees actually went 103-58.[2] In efforts to fix this error, statisticians have performed numerous searches to find the ideal exponent. If using a single-number exponent, 1.83 is the most accurate, and the one used by baseball-reference.com, the premier website for baseball statistics across teams and time.[3] The updated formula therefore reads as follows:

Win = (runs scored)^1.83 / ((runs scored)^1.83 + (runs allowed)^1.83)

The most widely known is the Pythagenport formula[4] developed by Clay Davenport of Baseball Prospectus, in which the exponent is

X = 1.50 * log10((R + RA) / G) + 0.45

He concluded that the exponent should be calculated from a given team based on the team's runs scored (R), runs allowed (RA), and games (G). By not reducing the exponent to a single number for teams in any season, Davenport was able to report a 3.9911 root-mean-square error as opposed to a 4.126 root-mean-square error for an exponent of 2.[4] Less well known but equally (if not more) effective is the Pythagenpat formula, developed by David Smyth.[5]

Davenport expressed his support for this formula, saying: "After further review, I (Clay) have come to the conclusion that the so-called Smyth/Patriot method, aka Pythagenpat, is a better fit. In that, X = ((rs + ra)/g)^0.285, although there is some wiggle room for disagreement in the exponent. Anyway, that equation is simpler, more elegant, and gets the better answer over a wider range of runs scored than Pythagenport, including the mandatory value of 1 at 1 rpg."[6]

These formulas are only necessary when dealing with extreme situations in which the average number of runs scored per game is either very high or very low. For most situations, simply squaring each variable yields accurate results. There are some systematic statistical deviations between actual winning percentage and expected winning percentage, which include bullpen quality and luck. In addition, the formula tends to regress toward the mean, as teams that win a lot of games tend to be underrepresented by the formula (meaning they "should" have won fewer games), and teams that lose a lot of games tend to be overrepresented (they "should" have won more).
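The basic estimate and the Pythagenpat exponent are easy to compute directly; a short Python sketch using the 2002 Yankees figures from above:

def pythagorean_win_pct(runs_scored, runs_allowed, exponent=2.0):
    """Expected winning percentage from runs scored and allowed."""
    rs, ra = runs_scored ** exponent, runs_allowed ** exponent
    return rs / (rs + ra)

def pythagenpat_exponent(runs_scored, runs_allowed, games):
    """Smyth/Davenport 'Pythagenpat' exponent from the run environment."""
    return ((runs_scored + runs_allowed) / games) ** 0.285

# 2002 Yankees: 897 runs scored, 697 allowed.
pct = pythagorean_win_pct(897, 697)        # ~0.6235 with exponent 2
print(round(162 * pct, 1))                 # ~101 expected wins (actual: 103)

x = pythagenpat_exponent(897, 697, 162)    # run-environment-sensitive exponent
print(round(pythagorean_win_pct(897, 697, x), 4))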

"Second-order" and "third-order" wins


In their Adjusted Standings Report,[7] Baseball Prospectus refers to different "orders" of wins for a team. The basic order of wins is simply the number of games they have won. However, because a team's record may not reflect its true talent due to luck, different measures of a team's talent were developed. First-order wins, based on pure run differential, are the number of expected wins generated by the "pythagenport" formula (see above). In addition, to further filter out the distortions of luck, sabermetricians can also calculate a team's expected runs scored and allowed via a runs created-type equation (the most accurate at the team level being Base Runs). These formulas result in the team's expected number of runs given their offensive and defensive stats (total singles, doubles, walks, etc.), which helps to eliminate the luck factor of the order in which the team's hits and walks came within an inning. Using these stats, sabermetricians can calculate how many runs a team "should" have scored or allowed.

By plugging these expected runs scored and allowed into the pythagorean formula, one can generate second-order wins, the number of wins a team deserves based on the number of runs they should have scored and allowed given their component offensive and defensive statistics. Third-order wins are second-order wins that have been adjusted for strength of schedule (the quality of the opponent's pitching and hitting). Second- and third-order winning percentage has been shown to predict future actual team winning percentage better than both actual winning percentage and first-order winning percentage.

Theoretical explanation
Initially the correlation between the formula and actual winning percentage was simply an experimental observation. In 2003, Hein Hundal provided an inexact derivation of the formula and showed that the Pythagorean exponent was approximately 2/(σ√π), where σ was the standard deviation of runs scored by all teams divided by the average number of runs scored.[8] In 2006, Professor Steven J. Miller provided a statistical derivation of the formula[9] under some assumptions about baseball games: if runs for each team follow a Weibull distribution and the runs scored and allowed per game are statistically independent, then the formula gives the probability of winning.[9]

More simply, the Pythagorean formula with exponent 2 follows immediately from two assumptions: that baseball teams win in proportion to their "quality", and that their "quality" is measured by the ratio of their runs scored to their runs allowed. For example, if Team A has scored 50 runs and allowed 40, its quality measure would be 50/40 or 1.25. The quality measure for its (collective) opponent team B, in the games played against A, would be 40/50 (since runs scored by A are runs allowed by B, and vice versa), or 0.8. If each team wins in proportion to its quality, A's probability of winning would be 1.25 / (1.25 + 0.8), which equals 50^2 / (50^2 + 40^2), the Pythagorean formula. The same relationship is true for any number of runs scored and allowed, as can be seen by writing the "quality" probability as [50/40] / [50/40 + 40/50], and clearing fractions.

The assumption that one measure of the quality of a team is given by the ratio of its runs scored to allowed is both natural and plausible.[citation needed] [There are other natural and plausible candidates for team quality measures, which, assuming a "quality" model, lead to corresponding winning percentage expectation formulas that are roughly as accurate as the Pythagorean ones.] The assumption that baseball teams win in proportion to their quality is not natural, but is plausible. It is not natural because the degree to which sports contestants win in proportion to their quality is dependent on the role that chance plays in the sport. If chance plays a very large role, then even a team with much higher quality than its opponents will win only a little more often than it loses. If chance plays very little role, then a team with only slightly higher quality than its opponents will win much more often than it loses. The latter is more the case in basketball, for various reasons, including that many more points are scored than in baseball (giving the team with higher quality more opportunities to demonstrate that quality, with correspondingly fewer opportunities for chance or luck to allow the lower-quality team to win).

Baseball has just the right amount of chance in it to enable teams to win roughly in proportion to their quality, i.e. to produce a roughly Pythagorean result with exponent two. Basketball's higher exponent of around 14 (see below) is due to the smaller role that chance plays in basketball. And the fact that the most accurate (constant) Pythagorean exponent for baseball is around 1.83, slightly less than 2, can be explained by the fact that there is (apparently) slightly more chance in baseball than would allow teams to win in precise proportion to their quality. Bill James realized this long ago when noting that an improvement in accuracy on his original Pythagorean formula with exponent two could be realized by simply adding some constant number to the numerator, and twice the constant to the denominator. This moves the result slightly closer to .500, which is what a slightly larger role for chance would do, and what using the exponent of 1.83 (or any positive exponent less than two) does as well. Various candidates for that constant can be tried to see what gives a "best fit" to real life data.

The fact that the most accurate exponent for baseball Pythagorean formulas is a variable that is dependent on the total runs per game is also explainable by the role of chance, since the more total runs scored, the less likely it is that the result will be due to chance, rather than to the higher quality of the winning team having been manifested during the scoring opportunities. The larger the exponent, the farther away from a .500 winning percentage is the result of the corresponding Pythagorean formula, which is the same effect that a decreased role of chance creates. The fact that accurate formulas for variable exponents yield larger exponents as the total runs per game increases is thus in agreement with an understanding of the role that chance plays in sports.

In his 1981 Baseball Abstract, James explicitly developed another of his formulas, called the log5 formula (which has since proven to be empirically accurate), using the notion of two teams having a face-to-face winning percentage against each other in proportion to a "quality" measure. His quality measure was half the team's "wins ratio" (or "odds of winning"). The wins ratio or odds of winning is the ratio of the team's wins against the league to its losses against the league. [James did not seem aware at the time that his quality measure was expressible in terms of the wins ratio. Since in the quality model any constant factor in a quality measure eventually cancels, the quality measure is today better taken as simply the wins ratio itself, rather than half of it.] He then stated that the Pythagorean formula, which he had earlier developed empirically, for predicting winning percentage from runs, was "the same thing" as the log5 formula, though without a convincing demonstration or proof. His purported demonstration that they were the same boiled down to showing that the two different formulas simplified to the same expression in a special case, which is itself treated vaguely, and there is no recognition that the special case is not the general one. Nor did he subsequently promulgate to the public any explicit, quality-based model for the Pythagorean formula. As of 2013, there is still little public awareness in the sabermetric community that a simple "teams win in proportion to quality" model, using the runs ratio as the quality measure, leads directly to James's original Pythagorean formula.

In the 1981 Abstract, James also says that he had first tried to create a "log5" formula by simply using the winning percentages of the teams in place of the runs in the Pythagorean formula, but that it did not give valid results. The reason, unknown to James at the time, is that his attempted formulation implies that the relative quality of teams is given by the ratio of their winning percentages.
Yet this cannot be true if teams win in proportion to their quality, since a .900 team wins against its opponents, whose overall winning percentage is roughly .500, in a 9 to 1 ratio, rather than the 9 to 5 ratio of their .900 to .500 winning percentages. The empirical failure of his attempt led to his eventual, more circuitous (and ingenious) and successful approach to log5, which still used quality considerations, though without a full appreciation of the ultimate simplicity of the model and of its more general applicability and true structural similarity to his Pythagorean formula.

Use in basketball

American sports executive Daryl Morey was the first to adapt James' Pythagorean expectation to professional basketball while a researcher at STATS, Inc. He found that using 13.91 for the exponents provided an acceptable model for predicting won-lost percentages:

Daryl's "Modified Pythagorean Theorem" was first published in STATS Basketball Scoreboard, 1993-94.[10] Noted basketball analyst Dean Oliver also applied James' Pythagorean theory to professional basketball. The result was similar. Another noted basketball statistician, John Hollinger, uses a similar Pythagorean formula except with 16.5 as the exponent.
