Methodology

How the numbers are computed

Every number on this site is computed deterministically from the underlying source data. AI-generated text is labeled and validated by a separate model pass.

01 Aggregation methods

Every chart can switch between 6 methods. None are 'right' — each makes different tradeoffs.

Simple mean
Unweighted arithmetic average of polls in the rolling window. Every poll counts the same. Noisy but transparent.
Weighted
Each poll's weight is the product of three factors: sample size, √(n/600); recency, exp(−age × ln(2)/14d), which is a 14-day half-life; and the pollster accuracy weight (1.0 by default, up to 4.0×).
House-corrected
Same as Weighted, but each pollster's measured signed bias is subtracted from each result before averaging. Production default for race pages.
Trimmed
Same weights as Weighted, but the highest and lowest 10% of polls per anchor date are dropped. Robust to outliers.
LOESS
Local polynomial regression (frac=0.3). Smoother visual trend, less reactive to single polls. Requires ≥3 polls per candidate.
Kalman state-space
Models the true value as a latent random walk and polls as noisy observations. The smoothest line — closest to what 538 / Silver Bulletin show.
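
The Weighted, House-corrected, and Trimmed rules above can be sketched in Python as follows. This is a minimal illustration, not the site's actual code: the `Poll` record, the function names, the `house_bias` field, and the round-down trimming rule are assumptions made for the example. Only the formulas (√(n/600), the 14-day half-life, the 10% trim) come from the descriptions above.

```python
import math
from dataclasses import dataclass


@dataclass
class Poll:
    value: float                   # candidate share or margin, in points
    sample_size: int               # n
    age_days: float                # days between the poll and the anchor date
    accuracy_weight: float = 1.0   # pollster accuracy weight (1.0 default, up to 4.0)
    house_bias: float = 0.0        # measured signed bias in points (hypothetical field)


def poll_weight(p: Poll) -> float:
    """Sample-size factor x recency factor (14-day half-life) x accuracy weight."""
    size = math.sqrt(p.sample_size / 600)
    recency = math.exp(-p.age_days * math.log(2) / 14)
    return size * recency * p.accuracy_weight


def weighted_mean(polls, house_corrected=False):
    """Weighted average; optionally subtract each pollster's signed bias first."""
    total = sum(poll_weight(p) for p in polls)
    return sum(
        poll_weight(p) * (p.value - (p.house_bias if house_corrected else 0.0))
        for p in polls
    ) / total


def trimmed_mean(polls, trim=0.10):
    """Drop the highest and lowest `trim` share of polls by value, then weight."""
    k = int(len(polls) * trim)   # assumption: round down, so fewer than 10 polls trims nothing
    ordered = sorted(polls, key=lambda p: p.value)
    kept = ordered[k:len(ordered) - k] if k else ordered
    return weighted_mean(kept)
```

Under these assumptions, a 600-person poll taken on the anchor date gets weight 1.0, and the same poll 14 days old gets weight 0.5.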
02 Race ratings

Each race gets one of: Safe / Likely / Lean / Tilt × D/R, plus Tossup. Source priority: polls → markets → PVI baseline.

Margin |D−R|    Rating
≥ 15 pt         Safe
5–15 pt         Likely
2–5 pt          Lean
0.5–2 pt        Tilt
< 0.5 pt        Tossup

When no recent polls exist, we fall back first to (a) the prediction-market implied probability, then to (b) the state's Cook PVI, rated one bucket more cautiously than the PVI alone suggests.
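
As a rough sketch of the margin thresholds above, the poll-based rating could be computed like this. The function name `rate_race` is hypothetical, and the handling of exact boundary values (each bucket includes its lower bound) is an assumption, since the table only states that ≥ 15 is Safe and < 0.5 is Tossup.

```python
def rate_race(d_pct: float, r_pct: float) -> str:
    """Map a D-minus-R margin (in points) to a rating label."""
    margin = d_pct - r_pct
    size = abs(margin)
    # Assumption: each bucket includes its lower bound (15.0 -> Safe, 0.5 -> Tilt).
    if size < 0.5:
        return "Tossup"
    if size >= 15:
        strength = "Safe"
    elif size >= 5:
        strength = "Likely"
    elif size >= 2:
        strength = "Lean"
    else:
        strength = "Tilt"
    party = "D" if margin > 0 else "R"
    return f"{strength} {party}"
```

For example, `rate_race(52, 45)` has a 7-point Democratic margin and falls in the Likely D bucket.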

03 Pollster scorecards
  • Race scorecard — for polls within 21 days of an election with a known outcome, compute error = implied two-way margin (R% − D%) − actual margin. The mean absolute error is the pollster's accuracy; the signed mean error is its bias.
  • Topic scorecard — for topic polls (approval, generic ballot, etc.) where there's no "outcome", compare each poll to the rolling consensus of all other pollsters on the same topic within ±30 days. The deviation is the pollster's house effect.

Pollsters with mean_error ≤ 4pt and ≥3 scored polls get an aggregation weight = 4 / mean_error, clipped to [0.25, 4.0].
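
A minimal sketch of the race scorecard and the resulting aggregation weight is shown below. The function names are hypothetical, "mean error" is read as mean absolute error, and pollsters that do not qualify are assumed to keep the 1.0 default weight; none of this is confirmed as the site's actual implementation.

```python
from statistics import mean


def pollster_scorecard(polls):
    """polls: list of (poll_margin, actual_margin) pairs, both as R% - D% in points."""
    errors = [poll - actual for poll, actual in polls]
    mean_error = mean(abs(e) for e in errors)   # assumption: mean absolute error
    bias = mean(errors)                         # signed mean error
    return mean_error, bias, len(errors)


def accuracy_weight(mean_error, n_scored, default=1.0):
    """4 / mean_error clipped to [0.25, 4.0], for pollsters that qualify."""
    if n_scored < 3 or mean_error > 4.0:
        return default          # assumption: non-qualifying pollsters keep the default
    if mean_error == 0:
        return 4.0              # avoid division by zero; a perfect record hits the cap
    return min(4.0, max(0.25, 4.0 / mean_error))
```

With three scored polls whose errors are +2, −2, and +2 points, the mean error is 2.0pt and the weight is 4 / 2.0 = 2.0.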

04 Chamber control probability

Monte Carlo, 20,000 simulations. Each race draws independently from its rating-implied D-win probability:

Rating      P(D wins)
Safe D      99%
Likely D    92%
Lean D      78%
Tilt D      60%
Tossup      50%
Tilt R      40%
Lean R      22%
Likely R    8%
Safe R      1%
Caveat

Independent draws make the seat-count distribution too narrow, which overstates confidence in the chamber outcome in close years. Real elections have national waves: a +3 R cycle moves all races together, not each one independently. A correlated-swing model is on the roadmap.
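
The independent-draw simulation described in this section can be sketched as follows. The probability table is taken from above; the function name, the `seats_needed` parameter, and the fixed `seed` are assumptions introduced for the example.

```python
import random

P_D_WIN = {
    "Safe D": 0.99, "Likely D": 0.92, "Lean D": 0.78, "Tilt D": 0.60,
    "Tossup": 0.50,
    "Tilt R": 0.40, "Lean R": 0.22, "Likely R": 0.08, "Safe R": 0.01,
}


def chamber_control_probability(ratings, seats_needed, n_sims=20_000, seed=0):
    """Share of simulations in which Democrats win at least `seats_needed` races."""
    rng = random.Random(seed)            # hypothetical fixed seed for repeatable output
    probs = [P_D_WIN[r] for r in ratings]
    d_majorities = 0
    for _ in range(n_sims):
        # Each race is drawn independently, as described above.
        d_seats = sum(rng.random() < p for p in probs)
        if d_seats >= seats_needed:
            d_majorities += 1
    return d_majorities / n_sims
```

Seeding the generator is one way to keep a Monte Carlo figure repeatable from run to run; whether the site does this is not stated.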

05 Backtest results

Train on past cycles, test on a held-out cycle:

Train → Test      Method                       MAE      Bias
2020+22 → 2024    Unweighted                   4.27pt   −3.06pt
2020+22 → 2024    Weighted                     4.24pt   −3.11pt
2020+22 → 2024    House-corrected              3.91pt   −2.41pt
2020+22 → 2024    Weighted + House-corrected   3.76pt   −2.47pt
Headline

2024 had a ~3pt industry-wide D-bias on final-stretch polls (25 of 30 backtested races over-predicted Democrats). House-effect correction recovers ~0.6pt of that: bias shrinks from −3.06pt (Unweighted) to −2.41pt (House-corrected). The remainder is cycle-level bias, which the model does not capture.

06 AI-generated content

Daily reports, race summaries, topic summaries, state briefings, ballot-measure explainers, weekly digests, and glossary entries are generated by Claude (via the local CLI). Each piece goes through a separate validator pass that checks every claim against the source data. Content that fails validation is still shown but flagged with an unvalidated caution alert — never silently published.