Notes
On prediction intervals, uncertainty quantification, the geometry of regression, and multi-agent AI.
Each note has three levels of depth — intuitive, technical, and advanced — so you can read at whatever level matches your background.
coverage
Conformal Prediction
Distribution-free prediction intervals with finite-sample guarantees — from the basic recipe to its fundamental limitations and existing attempts to overcome them.
- Your Model Is Confident. Should You Be? Why point predictions are incomplete, what prediction intervals actually are, and why constructing them correctly is harder than it looks.
- Conformal Prediction. The split conformal recipe: train, calibrate, quantile, done. A finite-sample coverage guarantee for any model, any distribution.
- The Constant-Width Problem. Marginal versus conditional coverage, the impossibility theorem, and why constant-width intervals hide dangerous unevenness.
- Heteroscedasticity and Variance Stabilization. When prediction difficulty varies, raw residuals are not comparable. Variance-stabilizing transformations and weighted nonconformity scores.
- Adaptive Conformal Methods. CQR, Studentized CP, and Localized CP — what each gets right, what each gets wrong, and the gap that remains.
- The Origins of Conformal Prediction. From Kolmogorov’s foundations through Vovk’s transductive framework to the modern split method — how conformal prediction came to be.
- Beyond the Split. Full conformal, cross-conformal, and jackknife+ — recovering statistical efficiency without giving up finite-sample coverage.
- When Exchangeability Breaks. Distribution shift, non-stationarity, and feedback loops. What happens to conformal guarantees when the real world violates the one assumption we need.
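The split recipe these notes cover fits in a few lines. A minimal NumPy sketch, using made-up toy data and a deliberately crude model (a least-squares slope through the origin) to keep the conformal steps in focus; illustrative only, not any note's implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: y = 2x + unit noise.
x = rng.uniform(0, 10, size=2000)
y = 2 * x + rng.normal(0, 1, size=2000)

# Split: fit on one half, calibrate on the other.
x_fit, y_fit = x[:1000], y[:1000]
x_cal, y_cal = x[1000:], y[1000:]

# Train: any model works; here, a one-parameter least-squares fit.
slope = np.dot(x_fit, y_fit) / np.dot(x_fit, x_fit)

# Calibrate: nonconformity scores are absolute residuals on held-out data.
scores = np.abs(y_cal - slope * x_cal)

# Quantile: the finite-sample-corrected (n+1) quantile for 90% coverage.
n, alpha = len(scores), 0.1
q = np.quantile(scores, np.ceil((n + 1) * (1 - alpha)) / n, method="higher")

# Done: a constant-width interval around each point prediction.
x_new = 5.0
interval = (slope * x_new - q, slope * x_new + q)
```

The constant width is the point of the third note above: every `x_new` gets the same `q`, however easy or hard it is to predict.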
influence
Leverage Scores
The hat matrix, leverage scores, and a geometric perspective on prediction uncertainty — a classical tool that tells you where your model is extrapolating, for free.
- The Hat Matrix. How the diagonal of a projection matrix tells you exactly how unusual each data point is, and why that matters for prediction.
- Leverage and Influence. Classical regression diagnostics, Cook’s distance, and why randomized algorithms make leverage scalable to modern datasets.
- The Sign Flip. Training residuals have variance σ²(1−h); prediction errors have variance σ²(1+h). The sign of leverage flips, and most methods get this wrong.
- Leverage as a Free Lunch. Closed-form, model-free, computationally negligible, and immune to the sign flip. Everything you need is already in the design matrix.
- A Brief History of Leverage. From regression diagnostics in the 1970s to randomized algorithms — how leverage scores went from a statistical curiosity to a computational workhorse.
- Randomized NLA: Why Leverage Scores Are the Right Sampling Distribution. Leverage score sampling gives optimal row sketches for least squares, matrix approximation, and beyond. The theory of Drineas, Kannan, and Mahoney.
- Leverage in High Dimensions. Ridge leverage, kernel leverage, and neural tangent leverage — extending the hat matrix beyond classical OLS to modern high-dimensional models.
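The hat matrix diagonal is as cheap as the notes claim. A small NumPy sketch with a fabricated design matrix containing one far-out point, to show the leverages picking it out; the data are invented for the example:

```python
import numpy as np

rng = np.random.default_rng(0)

# Design matrix: intercept plus one feature, with one outlying point at x = 8.
x = np.concatenate([rng.normal(0, 1, size=50), [8.0]])
X = np.column_stack([np.ones_like(x), x])

# Hat matrix H = X (X^T X)^{-1} X^T; its diagonal holds the leverage scores.
H = X @ np.linalg.solve(X.T @ X, X.T)
leverage = np.diag(H)

# Leverages sum to trace(H), the number of columns of X: the projection's rank.
assert np.isclose(leverage.sum(), X.shape[1])

# The outlying point dominates: it is where the model extrapolates most.
most_unusual = leverage.argmax()  # index 50, the x = 8 point
```

No model was fit and no residual was computed: everything came from the design matrix, which is the "free lunch" claim above.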
collective
Agentic AI
Multi-agent architectures, ensemble theory, and the mathematics of LLM collaboration — from Condorcet's Jury Theorem to Mixture of Agents and beyond.
- Why One Model Isn't Enough. Ensemble theory, the bias-variance-diversity decomposition, and Condorcet's Jury Theorem — the theoretical case for multi-agent systems.
- From Mixture of Experts to Mixture of Agents. Three decades from MoE through sparse gating and Switch Transformers to Together AI's layered MoA architecture.
- The Aggregation Problem. Arrow's impossibility theorem, Condorcet cycles, LLM-as-Judge, and why synthesis sidesteps social choice impossibilities.
- Reasoning Architectures. Chain-of-Thought, ReAct, Tree of Thoughts, Graph of Thoughts, and GPTSwarm — the progression from linear to graph-structured reasoning.
- Agentic RAG. Self-RAG, Corrective RAG, Adaptive-RAG, and KARMA — turning retrieval from a one-shot lookup into an active research process.
- Scaling Laws for Multi-Agent Systems. When more agents help, when they don't, communication topologies, the efficiency frontier, and the open problems ahead.
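Condorcet's Jury Theorem, the starting point of this series, takes ten lines to see numerically. A toy simulation, not any of the architectures above: independent "agents" each answer a binary question correctly with probability p > 1/2, and majority vote over more of them is right more often (p = 0.6 and the trial count are arbitrary choices for the demo):

```python
import numpy as np

rng = np.random.default_rng(0)

p = 0.6          # each agent's individual accuracy, above chance
trials = 20_000  # independent questions to simulate

def majority_accuracy(n_agents):
    # One row per question, one Boolean vote per agent; True means correct.
    votes = rng.random((trials, n_agents)) < p
    # The majority is right when more than half the votes are correct.
    return np.mean(votes.sum(axis=1) > n_agents / 2)

acc_1 = majority_accuracy(1)      # roughly p
acc_15 = majority_accuracy(15)    # noticeably better
acc_101 = majority_accuracy(101)  # approaching certainty
```

The theorem's catch, which the later notes take up, is the independence assumption: correlated agents (say, clones of one base model) gain far less from voting.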