Methodology
How Priors turns raw NFL data into player ratings, team rankings, and projections — explained visually, no math required.
The Big Picture
From raw play-by-play data to the rankings and projections you see on every page — here's the full pipeline.
Data Sources
66K+ plays/season from 25 years of NFL history, weather, rosters, and betting markets.
Design Matrix
Sparse player-play matrix (66K × 2.5K) with 21 context features per play.
Bayesian Model
Hierarchical model with 11,722 parameters — players, teams, context, and era offsets.
Posterior Sampling
ChEES-HMC explores the probability landscape with 64 parallel chains on GPU.
Metrics & Rankings
EPA ratings, WAR, power rankings, game predictions, and draft valuations.
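The design-matrix step in the pipeline above can be sketched in a few lines. This is an illustrative Python toy, not the production Rust code: the player indices, the tiny "plays", and the +1/-1 sign convention for offense vs defense are all assumptions made for the example.

```python
# Toy sketch of the sparse player-play matrix: one row per play, one
# column per player. Out of ~2,500 columns, only the ~22 players on the
# field are nonzero, so each row stores just its nonzero entries.

def build_design_rows(plays, n_players):
    """Return one sparse row per play as a {column: value} dict.
    Sign convention here is illustrative: +1 offense, -1 defense."""
    rows = []
    for play in plays:
        row = {}
        for pid in play["offense"]:
            row[pid] = 1.0
        for pid in play["defense"]:
            row[pid] = -1.0
        assert len(row) <= n_players
        rows.append(row)
    return rows

plays = [
    {"offense": [0, 1, 2], "defense": [10, 11]},  # toy 3-vs-2 "play"
    {"offense": [2, 3],    "defense": [11, 12]},
]
rows = build_design_rows(plays, n_players=2_500)
# Each row keeps only its nonzero entries, so 66K x 2.5K stays cheap.
```

Context features (down, distance, weather, and so on) would be appended as extra dense columns on each row; the sparse part is the who-was-on-the-field block.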
Data Sources
Every prediction starts with data. Here's what feeds into the model.
Play-by-Play
Every snap from every game — play type, result, EPA, field position, down & distance. 66,000+ plays per season.
Player Participation
Which 22 players were on the field for each snap. Full participation data from 2016+, synthetic fallback for earlier seasons.
Roster & Contract
Player positions, age, draft capital, combine metrics, and salary cap information used for SPM priors.
Weather & Venue
Temperature, wind speed, precipitation, dome vs open-air, grass vs turf. Normalized and encoded as context features.
Betting Markets
Point spreads and over/unders provide market-consensus team strength priors. Calibrated at 0.04 EPA per point.
Historical Record
25 seasons of NFL data with consistent EPA framework. Era offsets account for rule changes over time.
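The market calibration mentioned above (0.04 EPA per point of spread) can be made concrete with a small sketch. The function name, the sign convention (negative spread = home favored), and the example spread are our illustrative choices, not the codebase's API.

```python
EPA_PER_POINT = 0.04  # calibration stated in the methodology

def spread_to_strength_prior(spread_home):
    """Convert a home point spread into a prior on the home team's
    net strength edge, in EPA units. Negative spread = home favored."""
    home_margin = -spread_home      # e.g. a -6.5 spread means home by 6.5
    return home_margin * EPA_PER_POINT

prior = spread_to_strength_prior(-6.5)  # home favored by 6.5 points
```

The point is that the market consensus enters the model as a soft prior on team strength, not as a hard constraint; the posterior can move away from it when the play-by-play evidence disagrees.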
The Model
A hierarchical Bayesian model decomposes every play into player talent, team strength, game context, and era effects.
Position Group Priors
Each position group (QB, RB, WR, TE, OL, DL, LB, DB) has its own talent distribution — how good is the average player at this position, and how much do they vary?
Player Effects
Each of 2,483 players gets a personal talent rating — regularized by their position group so we don't overreact to small samples.
Team Offense & Defense Effects
64 team-season ratings (32 teams × offense/defense), informed by betting-market priors, capture scheme quality, coaching, and roster-wide synergies beyond individual players.
Context Adjustments
21 situational factors that affect every play — the model learns how much each one matters.
Era Offsets
Rule changes shift league-wide scoring. Four eras are modeled to keep historical comparisons fair.
The sum of player talent + team quality + game context + era baseline = how many points we expect this play to produce, compared to league average.
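That additive decomposition can be written as a one-line function. The numbers plugged in below are invented for illustration, not real model output.

```python
def expected_epa(player_talent, team_quality, context, era_baseline):
    """The decomposition described above: expected EPA for this play,
    relative to league average, as a simple sum of effects."""
    return player_talent + team_quality + context + era_baseline

# Illustrative values: a good player, a good team, a tough situation,
# a slightly high-scoring era.
epa = expected_epa(player_talent=0.12, team_quality=0.05,
                   context=-0.03, era_baseline=0.01)
```

Because the model is additive on the EPA scale, each component's contribution can be read off directly, which is what makes the player, team, and context effects separately interpretable.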
The Inference Engine
How do we actually estimate 11,722 parameters at once? By intelligently exploring the probability landscape.
L-BFGS Optimizer
Start by finding the single most likely set of parameter values — the peak of the probability landscape.
Pathfinder Warmup
Compute a mass matrix from the curvature at the peak — tells the sampler which directions to explore faster or slower.
ChEES-HMC Sampling
64 independent chains explore the landscape simultaneously, bouncing around like billiard balls to map out all plausible parameter values.
Posterior Samples
32,000 complete snapshots of all parameters — representing the full range of plausible values, not just a single best guess.
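The "billiard ball" mechanic behind HMC can be shown on a one-dimensional toy problem. This is a stripped-down illustration only: the real sampler is ChEES-HMC with an adapted mass matrix running 64 chains on GPU over 11,722 dimensions, while this sketch runs a single plain HMC chain on a standard normal, with step size and trajectory length picked by hand.

```python
import math
import random

def leapfrog(q, p, grad_logp, eps, n_steps):
    """Simulate Hamiltonian dynamics: the 'billiard ball' trajectory."""
    p = p + 0.5 * eps * grad_logp(q)
    for _ in range(n_steps - 1):
        q = q + eps * p
        p = p + eps * grad_logp(q)
    q = q + eps * p
    p = p + 0.5 * eps * grad_logp(q)
    return q, p

def hmc_standard_normal(n_samples, seed=0):
    """Sample N(0, 1): log p(q) = -q^2 / 2, so grad log p(q) = -q."""
    rng = random.Random(seed)
    grad = lambda q: -q
    q, samples = 0.0, []
    for _ in range(n_samples):
        p0 = rng.gauss(0.0, 1.0)            # fresh random momentum
        q_new, p_new = leapfrog(q, p0, grad, eps=0.3, n_steps=10)
        # Metropolis accept/reject keeps the chain exact despite the
        # discretization error in the simulated trajectory.
        h_old = 0.5 * q * q + 0.5 * p0 * p0
        h_new = 0.5 * q_new * q_new + 0.5 * p_new * p_new
        if math.log(rng.random() + 1e-300) < h_old - h_new:
            q = q_new
        samples.append(q)
    return samples

samples = hmc_standard_normal(2000)
mean = sum(samples) / len(samples)
var = sum((s - mean) ** 2 for s in samples) / len(samples)
```

The gradient is what makes this efficient: instead of a random walk, each proposal follows the curvature of the probability landscape, so the sampler can cross the space in a few steps even in high dimensions.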
Why Bayesian?
One number. Is Patrick Mahomes "worth" 0.21 EPA/play? You get a point estimate, but no idea how confident to be.
A full distribution. Mahomes is probably around 0.21, but could be 0.16–0.26. We quantify the uncertainty, which flows into every ranking, projection, and prediction.
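The contrast can be made concrete with a handful of fake posterior draws for a single player. The draws and the nearest-rank quantile helper below are invented for illustration.

```python
def quantile(xs, q):
    """Nearest-rank quantile of a list (a sketch, not production code)."""
    xs = sorted(xs)
    idx = min(len(xs) - 1, int(q * len(xs)))
    return xs[idx]

# Ten fake posterior draws of one player's EPA/play effect.
draws = [0.16, 0.18, 0.19, 0.20, 0.21, 0.21, 0.22, 0.23, 0.25, 0.26]

point = sum(draws) / len(draws)               # the single "one number" view
lo, hi = quantile(draws, 0.025), quantile(draws, 0.975)  # the plausible range
```

The point estimate and the interval come from the same draws; the Bayesian view simply refuses to throw the spread away, so every downstream ranking and prediction inherits it.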
From Samples to Stats
32,000 posterior samples are distilled into the metrics, rankings, and predictions you see across the platform.
Player EPA Ratings
Per-play effect with 95% credible interval and rank bounds (best/worst plausible rank).
WAR (Wins Above Replacement)
EPA above the 20th-percentile player at each position, converted to wins (1 win = 30 EPA).
Power Rankings
Team offensive and defensive strength with uncertainty — drives game-day predictions.
Game Predictions
Win probabilities and point spreads from team effects + context + home-field. Updated weekly.
Season Simulations
10,000 Monte Carlo simulations of the remaining season — playoff odds, draft position, divisional races.
Draft & Trade Analysis
Prospect valuations from age curves, position priors, and historical comps. Trade calculator uses posterior WAR.
Scheme-Adjusted Rankings
Separates true talent from coaching/system boost using the System Product Index (SPI).
Uncertainty Diagnostics
Calibration curves, cross-validation R², posterior convergence stats, and model validation.
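Two of the distillations above, rank bounds and WAR, can be sketched from raw posterior draws. Everything below is illustrative: the tiny three-player sample, the 600-play season, and the replacement level of -0.03 EPA/play are assumptions for the example, with only the 30-EPA-per-win conversion taken from the methodology.

```python
# Fake posterior draws: each row is one sample, each column one player's
# per-play EPA effect. Real runs use 32,000 samples and 2,483 players.
draws = [
    [0.20, 0.10, 0.05],
    [0.22, 0.08, 0.06],
    [0.18, 0.12, 0.04],
    [0.21, 0.09, 0.07],
]

def rank_bounds(draws, player):
    """Best and worst plausible rank (1 = best) across posterior draws."""
    ranks = []
    for sample in draws:
        order = sorted(range(len(sample)), key=lambda i: -sample[i])
        ranks.append(order.index(player) + 1)
    return min(ranks), max(ranks)

def war(mean_epa, replacement_epa, plays, epa_per_win=30.0):
    """EPA above replacement over a season's plays, converted to wins."""
    return (mean_epa - replacement_epa) * plays / epa_per_win

best, worst = rank_bounds(draws, player=0)  # player 0 leads in every draw
example_war = war(mean_epa=0.21, replacement_epa=-0.03, plays=600)
```

Computing ranks inside each draw, rather than once on the posterior means, is what produces honest best-case/worst-case rank bounds: a player's rank varies draw to draw exactly as much as the model is uncertain.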
Glossary
Key terms in plain English.
EPA (Expected Points Added) — How many points a play added compared to what an average team would have produced in the same situation. Positive = good, negative = bad.
WAR (Wins Above Replacement) — Total wins a player adds over a replacement-level player (20th percentile at their position). Converts EPA into a single number reflecting overall value.
Posterior — Our best estimate of a value after seeing all the data. Not a single number, but a full range of plausible values with their likelihoods.
95% Credible Interval — The range where we're 95% confident the true value lives. Narrower = more certain. Shown as the shaded band around estimates.
Prior — What we expected before seeing the data. Informed by position group averages, observable traits (SPM), and betting markets.
Hierarchical Pooling — Players are grouped by position — the group average informs individual estimates. A QB with 50 plays is pulled toward the QB average, not the league average.
ChEES-HMC — The sampling algorithm that efficiently explores high-dimensional probability spaces. Think of it as a marble rolling on a curved surface — physics helps it explore.
SPI (System Product Index) — The gap between a player's raw EPA and their Bayesian true talent. Positive SPI = the scheme is boosting them. Negative = they're better than they look.
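The pull-toward-the-group-average idea in the glossary can be shown with a precision-weighted blend. The pseudo-play count below stands in for the prior weight the model learns; its value, and the player/group means, are invented for illustration.

```python
def shrink_to_group(player_mean, n_plays, group_mean, pseudo_plays=200):
    """Precision-weighted blend: small samples are pulled toward the
    position-group average. pseudo_plays is an illustrative prior weight."""
    w = n_plays / (n_plays + pseudo_plays)
    return w * player_mean + (1 - w) * group_mean

# Same raw performance (0.30 EPA/play), very different sample sizes.
small = shrink_to_group(player_mean=0.30, n_plays=50,   group_mean=0.05)
large = shrink_to_group(player_mean=0.30, n_plays=2000, group_mean=0.05)
```

With 50 plays the estimate lands much closer to the group average; with 2,000 plays the data dominates and the estimate stays near the raw number. That is the mechanism that keeps the model from overreacting to small samples.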
Technical details — The model is implemented in Rust with GPU-accelerated gradient computation via Vulkan. Inference runs on a custom ChEES-HMC sampler with Pathfinder initialization, diagonal Hessian mass matrix, and f64 precision throughout. The full codebase is continuously validated against held-out seasons with Brier score calibration, spread MAE benchmarks, and posterior predictive checks.