Past Big Data Bowl Recaps

Recap of the NFL's first and second annual Big Data Bowl competitions.

2019 Big Data Bowl Recap


Congratulations to the winners of the NFL’s inaugural Big Data Bowl. Finalists presented their findings to members of the NFL league office, team executives, industry-leading representatives and league sponsors at the NFL Combine. The two grand prize winners received four tickets to a 2019 regular season NFL game and a $1,000 gift card.



Matthew Reyers, Dani Chu, Lucas Wu, James Thomson, Simon Fraser University  Routes to Success

  • The group modeled play success rate and expected points under various passing route combinations. Using a technique called model-based clustering, the group found several complementary pass route patterns that could consistently yield positive outcomes, even when accounting for defensive formation and behavior.
  • Key Stat: Through effective pass route combinations, an offense could control roughly 70% of the field.


Nathan Sterken  RouteNet: a convolutional neural network for classifying routes

  • Sterken treated receiver routes as an image recognition problem, using a neural network to categorize each route. Once grouped, these patterns were compared to win probability added (the change in the offensive team’s chance of winning the game before and after the play).
  • Key Stat: The flat-in-post route, a staple of the Steve Spurrier days at the University of Florida, was the best three-receiver route combination.




Kyle Burris, Duke University  A trajectory planning algorithm for quantifying space ownership in professional football

  • Burris used metrics like player speed, direction and acceleration to chart the space occupied by the 22 players on the field. One example highlighted a 64-yard touchdown pass from Derek Carr to Johnny Holton to show how Holton’s speed and direction indicated that he was moving towards an open space well before he looked open on the field.
  • Key Stat: In the play above, Carr released the pass only 0.3 seconds after Holton beat his defender, suggesting the quarterback knew he was going to Holton before the receiver broke free.
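
The idea that speed and direction reveal an opening before position does can be illustrated with a much simpler model than Burris's trajectory planning algorithm. The sketch below is not his method: it assumes every player keeps his current velocity for a short reaction window and then runs at a fixed top speed, and it credits a spot on the field to whichever player would arrive first. The names `Player`, `MAX_SPEED`, `REACTION`, `time_to_reach`, and `owner`, along with all numeric values, are hypothetical.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass
class Player:
    team: str   # "offense" or "defense"
    x: float    # position downfield, in yards
    y: float    # position across the field, in yards
    vx: float   # velocity components, in yards per second
    vy: float


MAX_SPEED = 9.0  # assumed top speed for every player (yards/second)
REACTION = 0.5   # assumed time a player keeps his current velocity (seconds)


def time_to_reach(p: Player, tx: float, ty: float) -> float:
    """Estimated arrival time at (tx, ty) under the simplified model."""
    # Where the player ends up if he holds his current velocity during the reaction window.
    px = p.x + p.vx * REACTION
    py = p.y + p.vy * REACTION
    # After that, he covers the remaining distance at top speed.
    return REACTION + math.hypot(tx - px, ty - py) / MAX_SPEED


def owner(players: List[Player], tx: float, ty: float) -> str:
    """Team of the player who would reach (tx, ty) first."""
    return min(players, key=lambda p: time_to_reach(p, tx, ty)).team


# Hypothetical snapshot: the defender stands a yard closer to the target spot,
# but the receiver is already running toward it.
receiver = Player("offense", x=50.0, y=20.0, vx=8.0, vy=0.0)
defender = Player("defense", x=51.0, y=20.0, vx=0.0, vy=0.0)

print(owner([receiver, defender], 60.0, 20.0))  # spot downfield of both players
print(owner([receiver, defender], 45.0, 20.0))  # spot behind both players
```

In this snapshot the defender is nearer to the downfield spot, yet the receiver's momentum gets him there first, which is the kind of signal that lets a model flag a receiver as open before he looks open.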

Peter Wu, Brendon Gu, Carnegie Mellon University  DIRECT: A two-level system for defensive interference rooted in repeatability, enforceability, clarity, and transparency

  • Wu and Gu merged statistical modeling techniques with potential changes in penalty calls and receiver catch probability to consider new standards for defensive pass interference and defensive holding.
  • Key Stat: There appear to be two peaks in catch probabilities on pass interference calls, about 55% and 75%, which suggests the feasibility of a two-level foul system.

Jack Soslow, Jake Flancer, Eric Dong, Andrew Castle, University of Pennsylvania  Using autoencoded receiver routes to optimize yardage

  • The group presented three unique ways to represent pass route data, including time series and shape-based clustering. Merging in-play and game-specific traits, the group suggested that hitch routes are generally an underused strategy for increasing efficiency.
  • Key Stat: Roughly one in three Odell Beckham pass routes was classified as a 10-yard curl pattern.


Sameer Deshpande, Katherine Evans  Expected hypothetical completion probability

  • Deshpande and Evans tracked receiver catch probability across entire pass routes. Their approach allows for an estimation of the receiver’s performance regardless of when and where the pass was thrown.
  • Key Stat: On an 18-yard touchdown pass to Cooper Kupp during Week 1 of the 2017 season, Rams quarterback Jared Goff had another receiver who was more open than Kupp. Had Goff thrown to Robert Woods roughly 1.5 seconds into his drop-back, the pass would have had a 92% catch probability.

Cathy Ha, Lucas Calestini  Efficient speed usage and the impact of fatigue in speed performance: an exploratory study

  • Ha and Calestini examined how play-specific factors (such as play type), game factors (home vs. away, game surface, weather), and player fatigue (rest since last play, intensity of last play) affect player speed efficiency.
  • Key Stat: Alvin Kamara had the highest speed efficiency of any ball carrier during the first 6 weeks of the 2017 season.

Adam Vonder Haar  Exploratory data analysis of passing plays using NFL tracking data

  • Vonder Haar classified routes and defensive space allocation to identify which route combinations yielded the most open receivers. His approach using convex hulls to characterize defender spacing was particularly novel.
  • Key Stat: The receiver generating the highest maximum separation was targeted on only about one in five plays in the first six weeks of the 2017 season.
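
To make the convex hull idea concrete, here is a minimal sketch of how defender spacing could be summarized by the area of the smallest convex polygon enclosing the defenders. It is not Vonder Haar's code: the hull is computed with Andrew's monotone chain algorithm, the area with the shoelace formula, and the defender coordinates are hypothetical values in yards.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def cross(o: Point, a: Point, b: Point) -> float:
    """Cross product of (a - o) and (b - o); positive for a counter-clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points: List[Point]) -> List[Point]:
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower: List[Point] = []
    for p in pts:
        # Drop points that would make a clockwise or straight turn.
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper: List[Point] = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other.
    return lower[:-1] + upper[:-1]


def polygon_area(vertices: List[Point]) -> float:
    """Area of a simple polygon via the shoelace formula."""
    n = len(vertices)
    if n < 3:
        return 0.0
    total = 0.0
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


# Hypothetical defender (x, y) positions in yards at one moment of a play.
defenders = [
    (30.0, 10.0), (30.0, 40.0), (45.0, 10.0),
    (45.0, 40.0), (35.0, 25.0), (40.0, 20.0),
]
hull = convex_hull(defenders)
area = polygon_area(hull)
print(f"{len(hull)} hull vertices, {area:.1f} square yards covered")
```

A larger hull area suggests the defense is spread out, while a smaller one suggests it is compressed. Tracking that area frame by frame is one plausible way to relate defender spacing to how open receivers become.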

2020 Big Data Bowl Recap

The second annual Big Data Bowl, powered by Amazon Web Services (AWS), gathered analytically minded fans from across the world to predict the outcomes of rushing plays during the 2019 season.

Next Gen Stats data, which includes location, speed, acceleration, and velocity for all players on the field, was one of several data sets available to participants.

More than 2,000 data scientists from all over the world competed on Kaggle to predict rushing play outcomes during Weeks 13 through 17 of the 2019 regular season. Submissions were tracked on a leaderboard that updated weekly, and the participants who most accurately predicted how many yards rushers would gain won the competition.
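
This recap does not name the scoring rule behind the leaderboard. The Kaggle competition is generally described as using the continuous ranked probability score (CRPS), with each prediction given as a cumulative distribution over yards gained from -99 to 99. Treating that as an assumption, a minimal sketch of how one play could be scored looks like this (the helper names `crps` and `step_cdf` are hypothetical):

```python
from typing import List

MIN_YARDS = -99  # assumed lowest outcome covered by a prediction
MAX_YARDS = 99   # assumed highest outcome covered by a prediction


def crps(cdf: List[float], actual_yards: int) -> float:
    """Mean squared gap between a predicted CDF and the observed outcome.

    cdf[i] is the predicted probability that the rusher gains
    at most (MIN_YARDS + i) yards. Lower scores are better.
    """
    total = 0.0
    for i, predicted in enumerate(cdf):
        yards = MIN_YARDS + i
        # The "true" CDF jumps from 0 to 1 at the yardage actually gained.
        observed = 1.0 if yards >= actual_yards else 0.0
        total += (predicted - observed) ** 2
    return total / len(cdf)


def step_cdf(yards_gained: int) -> List[float]:
    """A fully confident prediction that exactly yards_gained will be gained."""
    return [1.0 if y >= yards_gained else 0.0
            for y in range(MIN_YARDS, MAX_YARDS + 1)]


perfect = crps(step_cdf(5), actual_yards=5)    # confident and correct
near_miss = crps(step_cdf(3), actual_yards=5)  # confident but two yards short
print(perfect, near_miss)
```

Under this rule a confident prediction that misses by two yards is penalized only slightly, while spreading probability across plausible outcomes protects against large misses.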

2020 Big Data Bowl Winner

2020 Big Data Bowl winner Dmitry Gordeev speaking in Indianapolis.


Data scientists Philipp Singer and Dmitry Gordeev from Austria captured the top prize of $50,000 with their highly technical approach in the NFL’s second annual Big Data Bowl.

Singer and Gordeev comprise The Zoo, a data science team based in Austria. Singer studied software engineering and economics in Austria, where he finished a Ph.D. in computer science. Gordeev studied at Moscow State University and graduated as a specialist in applied math/data mining. Both work at UNIQA Insurance Group in Vienna and have teamed up to win six gold medals on Kaggle.

2020 Big Data Bowl Finalists

For the 2020 competition, the top submissions shared $75,000 in prizes, and top contestants presented their work to NFL club analytics staff at the NFL Combine in Indianapolis. The finalists are listed below.


Matt Ploenzke, Harvard


Kellin Rumsey, Brandon DeFlon, University of New Mexico

Graham Pash, Walker Powell, NC State

Namrata Ray, Jugal Marfatia, Washington State University

Alex Stern, University of Virginia

Caio Brighenti, Colgate University