A Better Format

Boyd's World-> Breadcrumbs Back to Omaha-> A Better Format (Part 2) About the author, Boyd Nation

A Better Format (Part 2)

Publication Date: June 27, 2000

A Quick Recap

Last week, I discussed the flaws in the current tournament format and laid out my proposal for a better one -- a sequence of best-of-five series culminating in a Final Four in Omaha. This week, I want to go over one or two of the details of that proposed format and look at a simulated version of this year's tournament in order to make it a bit more concrete.

A few times in here, I refer to the probability of a team winning a particular game. When I give specific values, they're based on the ISR-based probabilities discussed in the ratings FAQ. Those are not exact, of course, since exactness is impossible in this context, but the orders of magnitude are almost certainly correct.

Why Best of Five?

Baseball is the most random of major American team sports. One way of seeing that is to look at the results in the American professional leagues: The best football teams win around 90% of their games, and the best basketball teams win around 80%. Meanwhile, baseball teams almost never win even 70%. (OK, I've assured myself of complaining email from the hockey fans, but if I include hockey, then I get mail from the soccer fans, and if I include soccer, then I get mail from the lacrosse fans. It's turtles all the way down, so I'm stopping here.) The numbers are less pronounced in the college ranks where the talent spread is larger, but it's still true that the better team is less likely to win a given game in baseball than in any other sport.

Because of this, it's important for any baseball event designed to choose a better team to improve the odds of that better team winning, and the best way to do that is by playing more games. A 162-game season does a good job of that, although the major leagues, too, manage to muck it up in the postseason while chasing the dollar. Unfortunately, the college season is too short and the number of teams too great for the regular season to pick a best team with any certainty, and besides, people like playoffs. That leads us to consider what happens in shorter formats.

The following chart shows the odds of the better team winning a series of varying length, given their probability of winning a single game:

Games in series (the 1-game column is the single-game probability)
1	3	5	7
0.55	0.57	0.59	0.61
0.60	0.65	0.68	0.71
0.65	0.72	0.76	0.80
0.70	0.78	0.84	0.87
0.75	0.84	0.90	0.93
0.80	0.90	0.94	0.97

In other words, a team that has a 60% chance of winning a single game has a 68% chance of winning a five-game series, for example; the more games we play, the better the chance that the better team wins. Note that the underdog's chance of winning never goes away; an upset just happens less often, so it's more meaningful when it does happen.
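For anyone who wants to check the chart, here is a minimal sketch of the arithmetic behind it. It assumes every game in a series is independent and that the better team's single-game probability stays fixed, which is a simplification (pitching rotations alone make it untrue), and the function name is just an illustrative choice.

```python
from math import comb


def series_win_probability(p: float, games: int) -> float:
    """Chance of winning a best-of-`games` series given single-game chance p.

    Assumption: games are independent and p is the same in every game.
    """
    needed = games // 2 + 1  # wins required to clinch the series
    q = 1.0 - p
    # The series ends on the clinching win; before it, the loser has k wins
    # (k = 0 .. needed - 1) arranged among the first needed - 1 + k games.
    return sum(comb(needed - 1 + k, k) * p**needed * q**k for k in range(needed))


if __name__ == "__main__":
    # One row per single-game probability, one column per series length.
    for p in (0.55, 0.60, 0.65, 0.70, 0.75, 0.80):
        print("\t".join(f"{series_win_probability(p, n):.2f}" for n in (1, 3, 5, 7)))
```

Rounded to two places, this should reproduce the rows of the chart above.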

So why best-of-five rather than best-of-seven? Notice that the gain from going to seven games is not, in most cases, as large as the gain from three to five. There are time limitations on this proposal; realistically, we're going to need to be able to play this with college pitching staffs in roughly the same amount of time as the current format. Given those facts, best-of-five seems a reasonable compromise.
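To put numbers on that diminishing return, here is a rough sketch that tracks a series state by state and prints the gain from each extra pair of games. It again assumes independent games at a fixed single-game probability, and `clinch_probability` is a hypothetical name used only for this illustration.

```python
from functools import lru_cache


def clinch_probability(p: float, games: int) -> float:
    """Chance of taking a best-of-`games` series, assuming independent games at fixed p."""
    needed = games // 2 + 1

    @lru_cache(maxsize=None)
    def win_from(a_left: int, b_left: int) -> float:
        # a_left / b_left: wins the better team / the underdog still need.
        if a_left == 0:
            return 1.0
        if b_left == 0:
            return 0.0
        # Either the better team wins the next game (probability p) or it doesn't.
        return p * win_from(a_left - 1, b_left) + (1.0 - p) * win_from(a_left, b_left - 1)

    return win_from(needed, needed)


if __name__ == "__main__":
    for p in (0.55, 0.60, 0.65, 0.70, 0.75, 0.80):
        three, five, seven = (clinch_probability(p, n) for n in (3, 5, 7))
        print(f"p={p:.2f}  3->5: +{five - three:.3f}  5->7: +{seven - five:.3f}")
```

Under these assumptions the unrounded step from three games to five is the larger one at every probability in the chart; rounding to two places is what makes a few rows look like a tie.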

A Sample Tournament

In order to get a feel for how this would play out, I wrote a simple simulator that uses the ISR-based probabilities to play out the tournament under my proposed format. I took the ISR top 16 and seeded them accordingly (more on this later), then ran the tournament a thousand times.

Of those 1000 runs, South Carolina won 325, Stanford won 180, and LSU won 97. On the other end, Mississippi State, Florida, Texas, and Auburn totaled 46 wins among them, so Cinderella's not dead in this scenario; she just doesn't come out as often. The difference is that, when Cinderella does come out, she hasn't been locked away in the basement in between grout-cleanings. She's just been given smaller portions of dessert and forced to drive the old Hyundai instead of the Beemer.
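A simulator along these lines can be sketched in a few lines of Python. This is not the actual simulator: the bracket order is inferred from the pairings in the sample run, the teams are identified only by seed, and `game_probability` is a made-up stand-in for the ISR-based probabilities, so the title counts it prints will not match the ones above.

```python
import random
from collections import Counter

# Standard 16-team bracket order by seed, consistent with the sample pairings:
# 1 v 16, 8 v 9, 4 v 13, 5 v 12, 2 v 15, 7 v 10, 3 v 14, 6 v 11.
BRACKET = [1, 16, 8, 9, 4, 13, 5, 12, 2, 15, 7, 10, 3, 14, 6, 11]


def game_probability(seed_a: int, seed_b: int) -> float:
    """HYPOTHETICAL stand-in for the ISR-based chance that seed_a beats seed_b.

    It only gives better seeds a modest edge; the real values come from the ratings.
    """
    return 1.0 / (1.0 + 10 ** ((seed_a - seed_b) / 25.0))


def play_series(a: int, b: int, rng: random.Random, wins_needed: int = 3) -> int:
    """Play one best-of-five (first to three wins) and return the winning seed."""
    p = game_probability(a, b)
    wins_a = wins_b = 0
    while wins_a < wins_needed and wins_b < wins_needed:
        if rng.random() < p:
            wins_a += 1
        else:
            wins_b += 1
    return a if wins_a == wins_needed else b


def play_tournament(rng: random.Random) -> int:
    """Run four rounds of series and return the champion's seed."""
    field = list(BRACKET)
    while len(field) > 1:
        # Adjacent entries meet; winners keep their bracket position.
        field = [play_series(field[i], field[i + 1], rng) for i in range(0, len(field), 2)]
    return field[0]


def simulate(runs: int = 1000, seed: int = 2000) -> Counter:
    """Count titles by seed over `runs` simulated tournaments."""
    rng = random.Random(seed)
    return Counter(play_tournament(rng) for _ in range(runs))


if __name__ == "__main__":
    for team_seed, count in sorted(simulate().items()):
        print(f"seed {team_seed:2d}: {count} titles")
```

Swapping in real single-game probabilities is a matter of replacing `game_probability`; the series and bracket logic stay the same.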

Without further ado (actually there's plenty of ado in there between rounds, but you try to find a cliche for that), here's a sample run of the tournament so you can see what the schedule looks like:

Round 1:

Thursday, May 25:

South Carolina 10, Mississippi State 5
Stanford 11, Florida 5
Texas 3, Louisiana State 1
Auburn 8, Arizona State 2
Florida State 9, Nebraska 2
North Carolina 5, Southern California 1
Georgia Tech 1, Houston 0
Clemson 11, Baylor 0

Friday, May 26:

South Carolina 8, Mississippi State 4
Stanford 1, Florida 0
Texas 1, Louisiana State 0
Arizona State 4, Auburn 0
Nebraska 7, Florida State 5
Southern California 5, North Carolina 4
Houston 5, Georgia Tech 0
Clemson 7, Baylor 1

Saturday, May 27:

South Carolina 11, Mississippi State 4 (South Carolina wins 3-0)
Stanford 3, Florida 1 (Stanford wins 3-0)
Louisiana State 3, Texas 2
Arizona State 5, Auburn 2
Florida State 9, Nebraska 6
Southern California 8, North Carolina 1
Georgia Tech 3, Houston 2
Clemson 3, Baylor 1 (Clemson wins 3-0)

Sunday, May 28:

Louisiana State 5, Texas 0
Auburn 10, Arizona State 1
Nebraska 10, Florida State 7
Southern California 5, North Carolina 3 (Southern California wins 3-1)
Georgia Tech 5, Houston 4 (Georgia Tech wins 3-1)

Monday, May 29:

Texas 10, Louisiana State 3 (Texas wins 3-2)
Arizona State 8, Auburn 7 (Arizona State wins 3-2)
Nebraska 11, Florida State 0 (Nebraska wins 3-2)

Note that this schedule allows the first round to utilize a nice feature that is currently underused -- Memorial Day. Whereas today the only games on the holiday are rain-delay catchups (Mississippi State - Florida State, Memorial Day, 1990, ahhhhh), with this schedule we're almost guaranteed some great games on Memorial Day, like these three.

Note also that upsets are not completely eliminated by any means -- #3 LSU heads home early here, for example, and Auburn almost pulls it off -- but they are somewhat rarer, which strikes me as a reasonable balance.

Round 2:

Thursday, June 1:

Clemson 9, South Carolina 8
Stanford 6, Georgia Tech 1
Southern California 11, Texas 5
Arizona State 1, Nebraska 0

Friday, June 2:

South Carolina 8, Clemson 4
Georgia Tech 7, Stanford 6
Texas 5, Southern California 4
Nebraska 1, Arizona State 0

Saturday, June 3:

South Carolina 6, Clemson 1
Stanford 4, Georgia Tech 0
Texas 10, Southern California 8
Nebraska 7, Arizona State 0

Sunday, June 4:

South Carolina 8, Clemson 5 (South Carolina wins 3-1)
Stanford 5, Georgia Tech 2 (Stanford wins 3-1)
Southern California 3, Texas 0
Arizona State 10, Nebraska 3

Monday, June 5:

Texas 8, Southern California 0 (Texas wins 3-2)
Arizona State 2, Nebraska 0 (Arizona State wins 3-2)

There are some really nice series here as Cinderella wears burnt orange to Omaha again, ASU and Nebraska play a great one, and South Carolina comes back nicely against Clemson. South Carolina, Arizona State, Stanford, and Texas head to Omaha.

Round 3:

Friday, June 9:

South Carolina 10, Arizona State 7
Stanford 3, Texas 2

Saturday, June 10:

South Carolina 3, Arizona State 0
Stanford 4, Texas 1

Sunday, June 11:

Arizona State 2, South Carolina 1
Stanford 5, Texas 4 (Stanford wins 3-0)

Monday, June 12:

South Carolina 4, Arizona State 0 (South Carolina wins 3-1)

This one's somewhat unusual in that the top two teams meet in the title game; that only happened about 10% of the time over the 1000-tournament run.

Round 4:

Thursday, June 15:

Stanford 4, South Carolina 0

Friday, June 16:

South Carolina 4, Stanford 2

Saturday, June 17:

South Carolina 5, Stanford 3

Sunday, June 18:

Stanford 8, South Carolina 2

Monday, June 19:

South Carolina 4, Stanford 3 (South Carolina wins 3-2)

The final game is played on Monday night. I realize that that's prime time, but I also realize that we're talking about mid-June here, when ratings are down and the networks are well-served by showing distinctive programs amidst all the reruns. In cases where the final series goes five games, the network gets games 3 and 4 on the weekend to build the tension toward a big finale. If it doesn't go five, they just plop in another couple of Everybody Loves Raymond reruns and no one gets hurt.

The biggest objection I've seen raised so far is that this is too intense a schedule for college pitching staffs. The potential is there for a team to play 20 games in 26 days, but there are two reasons I'm not all that concerned about that. First of all, it's extremely unlikely -- it never happened in a thousand runs of the simulator. Second, while no one actually played quite that intense a schedule this year, five games a week is not at all unusual (quite a few teams did it for three weeks in a row this year) and schedules are actually less intense now than some coaches would like them to be because of game limitations. Similarly, ten games in eleven days in Omaha is a bit shaky, but it's necessary to avoid too much down time for the teams and fans between the two series. That part of the schedule can be tweaked, probably by starting the Final Four a day or two earlier and letting the fans go to the SAC museum for a day or something.

What's Your Idea?

One of the strongest arguments for this format that I can make, though, is that there just aren't enough games now between the best teams in the country. Things are improving a bit because the big programs are working harder to play inter-regional games early in the year, but there just aren't enough of those marquee non-conference matchups for my taste. The postseason format now doesn't help much with that. This year's tournament, for example, featured only nineteen games between teams in the ISR top sixteen. The above example in my proposed format has sixty-two.

I realize that the devil is certainly in the details, and that the selection process is a large potential weakness of this plan. The system only works if all the candidates for "best team" actually make it into the tournament, and the committee has shown an infinite capacity for mistakes in the past. Since I'm just dreaming about them moving to this plan, I'll pretend that if they're smart enough to use the modified format, they're smart enough to choose the teams using some reasonable method like the ISR or some better numerical method. The down side to that is that they'd probably consider something like the RPI to be better.

I'd love to hear other ideas for formats, no matter how offbeat, so let me hear from you. If I get enough response, I'll go over some of the more interesting suggestions in a future column.

In "In the Beginning Was the Command Line", Neal Stephenson talks about structures built incrementally over a period of many years and the problems that can result from such practices. He was talking about actual physical buildings and computer operating systems, but his points hold just as well for processes and events, and they explain just fine what's happened to the NCAA baseball tournament. Let's tear this thing down and rebuild with a good, clean design.
