The College Baseball Ratings Page -- Frequently Asked Questions

Boyd's World-> The College Baseball Ratings Page-> Frequently Asked Questions About the author, Boyd Nation

The College Baseball Ratings Page
Frequently Asked Questions

The Questions:

What are the ISR's?
How are the ISR's computed?
Why are the ISR's needed?
Why don't you include my favorite factor -- such as home field advantage, margin of victory, or past performance?
How can you rank Vine Covered U. over Enormous State U. when ESU beat VCU twice?
Why do you have Podunk State ranked #2 on February 29 when they've never even won their home tournament before?
What are the implied probabilities based on the ISR's?
What are the RPI's?
What are the pseudo-RPI's?
How closely does the selection committee follow the RPI's?
What's wrong with the RPI's?
Who is Boyd Nation, and why should anyone pay attention to this stuff?

The Answers:

What are the ISR's?

The ISR's are the results of an algorithm designed to measure the quality of a team's season to date by combining its winning percentage with the difficulty of its schedule. The algorithm rates all teams simultaneously and attempts to make more accurate use of inter-regional games than other rating systems do.

How are the ISR's computed?

The basic idea is an iterative one. Begin with all teams set to an even rating -- 100 in this case. Then, for each game played, credit each team with its opponent's rating plus a fixed factor (25 in this case) for a win, or minus that factor for a loss. Total all of a team's results, divide by the number of games played, and that's the end of a cycle. Then use those numbers as the starting ratings for the next cycle, and repeat until every team gets the same result in two consecutive cycles.
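The cycle described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the actual program behind the ISR's: the function name `compute_isr`, the `(winner, loser)` game format, the small tolerance used in place of an exact "same results" check, and the cap on the number of cycles are all assumptions made for the example.

```python
def compute_isr(games, start=100.0, factor=25.0, tol=1e-9, max_cycles=10_000):
    """Iteratively rate teams from a list of (winner, loser) games.

    Hypothetical sketch: `tol` and `max_cycles` are assumptions added so the
    loop always stops, since a very small schedule may never settle exactly.
    """
    teams = {team for game in games for team in game}
    ratings = {team: start for team in teams}

    for _ in range(max_cycles):
        totals = {team: 0.0 for team in teams}
        played = {team: 0 for team in teams}

        for winner, loser in games:
            # Each team is credited with its opponent's current rating,
            # plus the factor for a win or minus the factor for a loss.
            totals[winner] += ratings[loser] + factor
            totals[loser] += ratings[winner] - factor
            played[winner] += 1
            played[loser] += 1

        # End of a cycle: average each team's results over its games.
        new_ratings = {team: totals[team] / played[team] for team in teams}

        # Stop when two consecutive cycles agree (within the tolerance).
        if all(abs(new_ratings[t] - ratings[t]) < tol for t in teams):
            return new_ratings
        ratings = new_ratings

    return ratings


# A made-up three-team schedule: A beat B, B beat C, A beat C.
games = [("A", "B"), ("B", "C"), ("A", "C")]
ratings = compute_isr(games)
for team in sorted(ratings, key=ratings.get, reverse=True):
    print(team, round(ratings[team], 1))
```

In this toy schedule the unbeaten team ends up above 100, the winless team below it, and the 1-1 team stays at 100, which is the behavior the description leads you to expect.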

Why are the ISR's needed?

While it's still a great game, college baseball suffers from the lack of an accurate rating system for measuring team quality. The traditional polls suffer from voters running on auto-pilot, and the RPI's used by the selection committee have some serious problems with the method used to determine strength of schedule. Because of the small amount of inter-regional play in the sport, some regions tend to be under-represented in the NCAA tournament, and mid-ranked teams from large conferences tend to be unfairly excluded. Although trying to get the selection committee to acknowledge this may be a hopeless cause, the ISR's are an attempt to find a better rating system.

Why don't you include my favorite factor -- such as home field advantage, margin of victory, or past performance?

Because I can't measure whether it increases accuracy, and I intentionally don't trust "common sense", since so much of it is wrong when it comes to baseball.

Any rating system for sports is inherently going to have a bit of imprecision built into it, because sports are inherently random; this is why we bother to watch the games rather than watching a predetermined art form like film or ballet. This is especially true for college baseball, in part because of the relatively short season and in part because baseball is the most random of the major sports. In professional sports, the best football teams generally win 90% of their games, the best basketball teams routinely win 80% of their games, and the best baseball teams struggle to win 66%.

Because of this, it's impossible to determine just how accurate any given rating system is. It's possible to see how accurate the results "look", and the ISR's do very well in that regard by mid-season. It's possible to see how accurately the regular-season rankings predict the post-season results, but only an extremist who's never actually thought about it would claim that the best team always wins a championship, especially under a format, such as that of the College World Series, designed more for television than for fairness.

With that in mind, I've chosen to keep the ISR's as simple as possible. I have experimented with many factors, including the ones above, and have failed to find any indication that they provide any better ratings than simply considering the current-season ratings in a straightforward manner.

How can you rank Vine Covered U. over Enormous State U. when ESU beat VCU twice?

Because ESU really stunk up the joint against Podunk State and VCU swept Our Lady of Perpetual Victories.

One of the basic tenets of the ISR's is that each game is worth the same amount. Big weekend conference series may impress the pollsters more, but mid-week losses to small schools may indicate fatal weaknesses in the bottom half of the pitching rotation. Or they may not; there's no way to know. Given that, a team's entire season must be looked at, and it must be considered in the context of every other team's season. That's too much data for a human brain to get a good feel for, especially when its owner is primarily focused on one team; that's why we have computers.

Why do you have Podunk State ranked #2 on February 29 when they've never even won their home tournament before?

Early-season results generally do not provide enough information for the algorithm to give a clear picture of what's going on. I provide early-season ratings only so that readers can get a feel for how the process develops; otherwise, they can be ignored until about mid-March, when things get more accurate. A good rule of thumb is to ignore the ISR for any team that has played fewer than eight games.

What are the implied probabilities based on the ISR's?

Over the 1998 and 1999 seasons, the results played out like this:

 Gap  Win %

 0- 2 0.507
 2- 4 0.558
 4- 6 0.635
 6- 8 0.674
 8-10 0.706
10-12 0.760
12-14 0.776
14-16 0.845
16-18 0.873
18-20 0.898
20-22 0.896
22-24 0.944
24-26 0.938
26-28 0.950
28-30 0.936
30-32 0.968
32-34 0.985
34-36 1.000
36-38 1.000
38-40 1.000
40-42 1.000
42-44 1.000
44-46 1.000
46-48 1.000

For example, when a team's ISR has been between 2 and 4 points higher than its opponent's, that team has won 55.8% of the time. These figures aren't nearly as precise as they appear, but they're fairly consistent between the two years, so they're probably a reasonably good approximation. They become more accurate as the year goes on and the ISR's have more data to work with.
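For readers who want to turn the table into a quick estimate, here is a small Python sketch of a lookup. The function name `implied_win_probability` is made up for the example, and two details are assumptions rather than anything the table states: a gap that lands exactly on a boundary (such as 2.0) is placed in the higher bucket, and gaps of 48 or more reuse the last row.

```python
# Win rates by ISR gap, copied from the table above (1998 and 1999 seasons).
# Entry i covers gaps from 2*i up to, but not including, 2*i + 2
# (the handling of exact boundaries is an assumption).
WIN_PCT_BY_GAP = [
    0.507, 0.558, 0.635, 0.674, 0.706, 0.760, 0.776, 0.845,
    0.873, 0.898, 0.896, 0.944, 0.938, 0.950, 0.936, 0.968,
    0.985, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000,
]


def implied_win_probability(isr_a, isr_b):
    """Estimate the chance that team A beats team B from their ISR's."""
    gap = abs(isr_a - isr_b)
    # Gaps beyond the table fall back to the last row (an assumption).
    bucket = min(int(gap // 2), len(WIN_PCT_BY_GAP) - 1)
    higher_rated_wins = WIN_PCT_BY_GAP[bucket]
    return higher_rated_wins if isr_a >= isr_b else 1.0 - higher_rated_wins


print(implied_win_probability(113.0, 110.0))  # gap of 3 uses the 2-4 row
```

The rows of 1.000 reflect only what happened in those two seasons, so treating them as certainties would be reading too much into the table.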

What are the RPI's?

The Rating Percentage Index is the official NCAA formula designed to aid the selection committee for each sport in choosing the tournament field. It is based on a combination of a team's winning percentage, its opponents' winning percentage, and its opponents' opponents' winning percentage, with bonuses for road wins against top teams and penalties for home losses to lower-ranked teams. The official RPI document for baseball is available in Microsoft Word format.

What are the pseudo-RPI's?

The pseudo-RPI's are my best effort at a simulation of the RPI's. The full formula is not released, but my best guess is that the sizes of the bonuses are .001 for wins over teams ranked between 51 and 75, .0035 for wins over teams ranked between 26 and 50, and .006 for wins over teams ranked between 1 and 25. The opponents' winning percentage is not computed from the opponents' combined record but is rather the average of each opponent's individual winning percentage. I'm still uncertain about the handling of neutral site games.
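The two pieces described above, the guessed bonus tiers and the averaging of opponents' winning percentages, can be sketched in Python. This is not the NCAA formula or the full pseudo-RPI calculation: the function names are invented for the example, penalties and the weighting of the components are left out because they aren't given here, and the sketch does not model whether a win has to come on the road to earn a bonus.

```python
def win_bonus(opponent_rank):
    """Guessed bonus for a win over an opponent with the given ranking."""
    if 1 <= opponent_rank <= 25:
        return 0.006
    if 26 <= opponent_rank <= 50:
        return 0.0035
    if 51 <= opponent_rank <= 75:
        return 0.001
    return 0.0  # no bonus for wins over teams ranked below 75


def average_opponent_pct(opponent_records):
    """Average each opponent's own winning percentage.

    `opponent_records` is a list of (wins, losses) pairs, one per opponent.
    This differs from pooling every opponent's wins and losses together.
    """
    percentages = [wins / (wins + losses) for wins, losses in opponent_records]
    return sum(percentages) / len(percentages)


# Made-up example: one opponent is 3-1 and another is 1-1.
records = [(3, 1), (1, 1)]
print(average_opponent_pct(records))  # average of .750 and .500
print(sum(w for w, _ in records) / sum(w + l for w, l in records))  # pooled
```

The two printed values differ, which is the point of the distinction: an opponent with few games counts as much as one with many when the percentages are averaged.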

How closely does the selection committee follow the RPI's?

It varies from year to year -- generally they seem to use the RPI's for justification more than guidance. Jim Carr has done a good bit of analysis on this.

What's wrong with the RPI's?

Although things are improving, there's still a very limited amount of inter-regional play in college baseball. This means that in sections of the country with fewer Division I baseball schools, such as the West, the pool of available opponents tends to be smaller, which tends to pull winning percentages towards .500. Because the RPI considers only two levels of interconnectedness (opponents and opponents' opponents), teams from these regions tend to be underranked by the RPI's.

Who is Boyd Nation, and why should anyone pay attention to this stuff?

Boyd is a lifelong college baseball fan who has a master's degree in computer science with a focus on algorithm development. The ISR's are intended to improve enjoyment of college baseball by producing better-informed fans; some of us enjoy the games more when we have a feel for how likely certain results are. If that's you, enjoy.
