How the numbers are computed
Every number on this site is computed deterministically from the underlying source data. AI-generated text is labeled and validated by a separate model pass.
01 Aggregation methods
Every chart can switch between 6 methods. None are 'right' — each makes different tradeoffs.
- Simple mean: Unweighted arithmetic average of polls in the rolling window. Every poll counts the same. Noisy but transparent.
- Weighted: Each poll's weight is the product of three factors: sample size, √(n/600); recency, exp(−age × ln(2) / 14 d), i.e. a 14-day half-life; and the pollster accuracy weight (1.0 default; up to 4.0×).
- House-corrected: Same as Weighted, but each pollster's measured signed bias is subtracted from each of its results before averaging. Production default for race pages.
- Trimmed: Same weights as Weighted, but the highest and lowest 10% of polls per anchor date are dropped. Robust to outliers.
- LOESS: Local polynomial regression (frac=0.3). Smoother visual trend, less reactive to single polls. Requires ≥3 polls per candidate.
- Kalman state-space: Models the true value as a latent random walk and polls as noisy observations. The smoothest line, and the closest to what 538 / Silver Bulletin show.
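The Weighted, House-corrected, and Trimmed methods share the same weight formula, so they can be illustrated together. The following is a minimal sketch, not the site's actual code: the function names, the poll dictionary keys (`value`, `n`, `age_days`, `pollster`, `pollster_weight`), and the choice to round the trim count down are all assumptions made for illustration.

```python
import math

HALF_LIFE_DAYS = 14.0  # recency half-life from the Weighted formula
REFERENCE_N = 600.0    # sample size that gives a size factor of exactly 1.0


def poll_weight(sample_size, age_days, pollster_weight=1.0):
    """Sample-size factor x recency factor x pollster accuracy weight."""
    size_factor = math.sqrt(sample_size / REFERENCE_N)
    recency_factor = math.exp(-age_days * math.log(2) / HALF_LIFE_DAYS)
    return size_factor * recency_factor * pollster_weight


def weighted_mean(polls, house_bias=None):
    """Weighted average of poll values.

    polls: list of dicts with hypothetical keys value, n, age_days,
           pollster, and optionally pollster_weight.
    house_bias: optional dict of pollster -> signed bias. When given,
           the bias is subtracted from each value (House-corrected).
    """
    house_bias = house_bias or {}
    total = 0.0
    weight_sum = 0.0
    for p in polls:
        w = poll_weight(p["n"], p["age_days"], p.get("pollster_weight", 1.0))
        value = p["value"] - house_bias.get(p["pollster"], 0.0)
        total += w * value
        weight_sum += w
    return total / weight_sum


def trimmed(polls, fraction=0.10):
    """Drop the highest and lowest `fraction` of polls, ranked by value."""
    k = int(len(polls) * fraction)  # assumption: round the trim count down
    ordered = sorted(polls, key=lambda p: p["value"])
    return ordered[k:len(ordered) - k]
```

A 600-person poll taken today gets weight 1.0, and the same poll 14 days old gets 0.5. Passing the output of `trimmed(...)` to `weighted_mean(...)` would approximate the Trimmed method for a single anchor date.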
02 Race ratings
Each race gets one of: Safe / Likely / Lean / Tilt × D/R, plus Tossup. Source priority: polls → markets → PVI baseline.
| Margin \|D−R\| | Rating |
|---|---|
| ≥ 15 pt | Safe |
| 5–15 pt | Likely |
| 2–5 pt | Lean |
| 0.5–2 pt | Tilt |
| < 0.5 pt | Tossup |
When no recent polls exist, we fall back to (a) the prediction-market implied probability or, failing that, (b) the state's Cook PVI, rated one bucket more conservative than the PVI suggests.
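The margin thresholds in the table above can be expressed as a small lookup. This is a sketch under one stated assumption: a margin that falls exactly on a boundary (2 or 5 pt) goes to the stronger bucket, matching the "≥ 15" and "< 0.5" rows. The function name `rate_margin` is hypothetical, and the market and PVI fallbacks are not modeled here.

```python
def rate_margin(d_minus_r):
    """Map a D-R polling margin (in points) to a rating label."""
    size = abs(d_minus_r)
    if size < 0.5:
        return "Tossup"
    # Assumption: lower bounds are inclusive, as in the ">= 15" row.
    if size >= 15:
        bucket = "Safe"
    elif size >= 5:
        bucket = "Likely"
    elif size >= 2:
        bucket = "Lean"
    else:
        bucket = "Tilt"
    party = "D" if d_minus_r > 0 else "R"
    return f"{bucket} {party}"
```

For example, `rate_margin(-3.0)` gives `"Lean R"` and `rate_margin(0.2)` gives `"Tossup"`.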
03 Pollster scorecards
- Race scorecard: for polls within 21 days of an election with a known outcome, error = implied two-way margin (R% − D%) − actual margin. The mean absolute error is the pollster's accuracy; the signed mean error is its bias.
- Topic scorecard: for topic polls (approval, generic ballot, etc.) that have no "outcome" to score against, compare each poll to the rolling consensus of all other pollsters on the same topic within ±30 days. The deviation is the pollster's house effect.
Pollsters with mean_error ≤ 4pt and ≥3 scored polls get an aggregation weight = 4 / mean_error, clipped to [0.25, 4.0].
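As an illustration of the race scorecard and the weight rule, here is a minimal sketch. It assumes that "mean error" means mean absolute error, and that pollsters who do not meet the thresholds keep the 1.0 default weight; the function names and the zero-error guard are hypothetical.

```python
def score_pollster(scored_polls):
    """Return (mean_error, bias) for one pollster.

    scored_polls: list of (poll_margin, actual_margin) pairs,
    both expressed as R% - D% in points.
    """
    errors = [poll - actual for poll, actual in scored_polls]
    mean_error = sum(abs(e) for e in errors) / len(errors)  # accuracy
    bias = sum(errors) / len(errors)                        # signed
    return mean_error, bias


def accuracy_weight(mean_error, n_scored, default=1.0):
    """Aggregation weight = 4 / mean_error, clipped to [0.25, 4.0]."""
    if n_scored < 3 or mean_error > 4.0:
        return default  # assumption: unqualified pollsters keep the default
    if mean_error == 0:
        return 4.0      # guard against division by zero; hits the upper clip
    return min(4.0, max(0.25, 4.0 / mean_error))
```

Because qualifying pollsters have a mean error of at most 4 pt, 4 / mean_error is never below 1.0, so in this sketch only the upper clip at 4.0 is ever binding.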
04 Chamber control probability
Monte Carlo, 20,000 simulations. Each race draws independently from its rating-implied D-win probability:
| Rating | P(D wins) |
|---|---|
| Safe D | 99% |
| Likely D | 92% |
| Lean D | 78% |
| Tilt D | 60% |
| Tossup | 50% |
| Tilt R | 40% |
| Lean R | 22% |
| Likely R | 8% |
| Safe R | 1% |
Independent draws ignore the correlation between races, which understates the spread of possible seat totals in close years. Real elections have national waves: a +3 R cycle moves all races together, not each one independently. A correlated-swing model is on the roadmap.
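The independent-draw simulation described in this section can be sketched as follows. The probability table is taken from above; everything else is an assumption for illustration, including the function name and the `seats_needed` and `d_seats_not_up` parameters, since the page does not say how the control threshold is defined.

```python
import random

# Rating -> P(D wins), copied from the table in this section.
P_D_WIN = {
    "Safe D": 0.99, "Likely D": 0.92, "Lean D": 0.78, "Tilt D": 0.60,
    "Tossup": 0.50,
    "Tilt R": 0.40, "Lean R": 0.22, "Likely R": 0.08, "Safe R": 0.01,
}


def chamber_control_probability(ratings, seats_needed, d_seats_not_up=0,
                                n_sims=20_000, seed=None):
    """Estimate P(Democrats hold at least `seats_needed` seats).

    ratings: one rating label per race on the ballot.
    d_seats_not_up: hypothetical count of D-held seats not up this cycle.
    Each race is drawn independently, as in the current model.
    """
    rng = random.Random(seed)
    probs = [P_D_WIN[r] for r in ratings]
    d_control = 0
    for _ in range(n_sims):
        d_wins = sum(rng.random() < p for p in probs)
        if d_seats_not_up + d_wins >= seats_needed:
            d_control += 1
    return d_control / n_sims
```

A correlated-swing variant would draw one shared national shift per simulation before drawing the individual races; that is not shown here because the page gives no parameters for it.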
05 Backtest results
Train on past cycles, test on a held-out cycle:
| Train → Test | Method | MAE | Bias |
|---|---|---|---|
| 2020+22 → 2024 | Unweighted | 4.27pt | −3.06pt |
| 2020+22 → 2024 | Weighted | 4.24pt | −3.11pt |
| 2020+22 → 2024 | House-corrected | 3.91pt | −2.41pt |
| 2020+22 → 2024 | Weighted + House-corrected | 3.76pt | −2.47pt |
2024 had ~3pt industry-wide D-bias on final-stretch polls (25 of 30 backtested races over-predicted Democrats). House-effect correction recovers ~0.6pt of that. Cycle-level bias is the residual unmodeled error.
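To make the train-then-test procedure concrete, here is a simplified per-poll sketch: house biases are estimated on the training cycles, then applied to the held-out cycle before computing MAE and bias. The row layout `(pollster, poll_margin, actual_margin)`, the function names, and the per-poll (rather than per-race) scoring are assumptions, and margins are taken as R% − D% so that a negative bias means Democrats were over-predicted.

```python
from collections import defaultdict


def fit_house_bias(train_rows):
    """Mean signed error per pollster on the training cycles.

    train_rows: iterable of (pollster, poll_margin, actual_margin).
    """
    errors = defaultdict(list)
    for pollster, poll, actual in train_rows:
        errors[pollster].append(poll - actual)
    return {name: sum(e) / len(e) for name, e in errors.items()}


def evaluate(test_rows, house_bias=None):
    """Return (MAE, bias) on the held-out cycle.

    When house_bias is given, each poll is corrected by subtracting
    its pollster's training-cycle bias before scoring.
    """
    house_bias = house_bias or {}
    errors = [
        poll - house_bias.get(pollster, 0.0) - actual
        for pollster, poll, actual in test_rows
    ]
    mae = sum(abs(e) for e in errors) / len(errors)
    bias = sum(errors) / len(errors)
    return mae, bias
```

Comparing `evaluate(test_rows)` with `evaluate(test_rows, fit_house_bias(train_rows))` gives the uncorrected and house-corrected numbers for the same held-out polls.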
06 AI-generated content
Daily reports, race summaries, topic summaries, state briefings, ballot-measure explainers, weekly digests, and glossary entries are generated by Claude (via the local CLI). Each piece goes through a separate validator pass that checks every claim against the source data. Content that fails validation is still shown but flagged with an unvalidated caution alert — never silently published.