A $15,000 prize for best NCAA basketball picks
How's your NCAA basketball bracket doing?
Probably not as well as Michael Lopez's. Lopez, an assistant professor of statistics at Skidmore College, and a partner won $15,000 for making the best picks in last year's NCAA Men's Basketball Tournament, beating out more than 200 other teams of data scientists who were intrigued by the challenge of coming up with the best mathematical approach.
As they explain in an article recently published by the Journal of Quantitative Analysis in Sports, they found a way to take the point spreads that Las Vegas bookies assigned to all 32 games in the tournament's first round and extend them through the entirety of the tournament.
Their algorithm used just two data sources: the Las Vegas point spreads and a set of offensive and defensive efficiency ratings developed by Ken Pomeroy, an independent basketball analyst.
Lopez and his colleague, Gregory Matthews, assistant professor of statistics at Loyola University Chicago, looked at 10 years of college basketball results, regular season and tournament games alike. Using a statistical technique called logistic regression, they turned point spreads into an estimated probability that team A will beat team B. In this way, Lopez says, "we were able to develop a model to predict the point spread for games that were yet to be played" and win the prize from Kaggle, a site that hosts data competitions.
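For readers curious about what turning a point spread into a probability looks like, here is a minimal sketch in Python. It is not Lopez and Matthews's actual model: the game results are made up for illustration, the function names are hypothetical, and the sketch assumes a single slope with no intercept, so that an even spread gives each team a 50 percent chance.

```python
import math

def win_probability(spread, slope):
    """Logistic curve: chance that team A wins, given the spread in A's favor."""
    return 1.0 / (1.0 + math.exp(-slope * spread))

def fit_slope(spreads, outcomes, iterations=25):
    """Fit a one-parameter logistic regression (no intercept) by Newton's method."""
    slope = 0.0
    for _ in range(iterations):
        gradient = 0.0
        curvature = 0.0
        for s, y in zip(spreads, outcomes):
            p = win_probability(s, slope)
            gradient += (y - p) * s              # derivative of the log-likelihood
            curvature += p * (1.0 - p) * s * s   # negative second derivative
        if curvature == 0.0:
            break
        slope += gradient / curvature
    return slope

# Made-up illustrative data, NOT real game results.
# A positive spread means team A was favored by that many points;
# an outcome of 1 means team A won, 0 means team A lost.
spreads = [-12, -8, -5, -3, -1, 1, 2, 4, 6, 9, 13]
outcomes = [0, 0, 0, 1, 0, 1, 0, 1, 1, 1, 1]

slope = fit_slope(spreads, outcomes)
for points in (0, 3, 7):
    chance = win_probability(points, slope)
    print(f"Favored by {points}: estimated win probability {chance:.2f}")
```

The fitted slope controls how quickly a larger spread pushes the favorite's chances toward certainty. A fuller model would be trained on many seasons of games, as the researchers did with 10 years of results.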
"Using the right data was much more important to our models' performing well than using more sophisticated models," Lopez notes. "Such factors as excess travel affect performance, but we didn't have to include them because they're already baked into the Vegas numbers."
While few fans are likely to employ a technique as fancy as logistic regression to fill out their brackets, there are some simple statistical rules to live by, says Lopez. He spells them out on his blog, StatsbyLopez. For more on the researchers' approach, see this report in The New York Times.
Kaggle is offering another $15,000 prize to be awarded to the data scientists who come up with the best statistical approach for predicting this year's NCAA tournament. Lopez and Matthews have entered, using the same approach as last year.
Who's their pick to go all the way? Kentucky, of course. "It's a more likely champion than in most years, but still not much different than a coin flip," says Lopez.