Separate effort (not scheduled here) tracking the fix for register C-32 (MAP bias) and C-33 (HDI has no nesting/tower guarantee) — one shared problem. Surfaced by a views-faoapi integration spike (2026-06-23).
Symptom
C-32 (MAP). map_estimate breaks histogram-peak ties on the lowest bin index (point.py:100, np.argmax(counts)), which was the C-24 portability fix. On real conflict posteriors (right-skewed, zero-inflated, heavy-tailed) the peak is often a multi-way tie, and lowest-index = leftmost = smallest value, so the mode is pulled toward zero (measured at the 32-draw speed setting: ~21% of active cells, one-directional, up to 7.9 ln; the magnitude shrinks at production sample counts but the bias direction and the non-convergence persist).
C-33 (HDI tower). hdi(frame, mass) returns one mass's shortest interval per call and stops: there is no nesting guarantee across masses and no multi-mass API. On skewed/multimodal empirical samples a narrower band can poke outside a wider one.
(The raw shortest-interval HDI is otherwise bit-identical to the consumer's — the divergence is purely the post-hoc enforcement the consumer adds and we don't.)
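The tie-break behaviour behind C-32 can be reproduced on a toy histogram. This is a minimal sketch, not the code in point.py: the sample values, bin count and range are made up for illustration, and the only library fact relied on is that np.argmax returns the first index among tied maxima.

```python
import numpy as np

# Hypothetical toy draws: three bins end up equally populated (a 3-way tie).
samples = np.array([0.5, 0.6, 4.5, 4.6, 8.5, 8.6])
counts, edges = np.histogram(samples, bins=10, range=(0.0, 10.0))

# np.argmax returns the FIRST maximal index, i.e. the leftmost tied bin.
peak = int(np.argmax(counts))
tied = np.flatnonzero(counts == counts.max())

# Bin centre reported as the "mode" under the lowest-index tie-break.
mode_lowest = 0.5 * (edges[peak] + edges[peak + 1])

print("tied bins:", tied.tolist(), "-> chosen bin:", peak, "mode:", mode_lowest)
```

All three bins are equally good candidates, yet the leftmost always wins, which is the one-directional pull toward zero described above.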
Why this is hard (not just a tie-break)
The mode is the only one of our estimates that is a functional of the density, not the CDF:
mean = an average; quantiles/HDI = order statistics → need only the samples ranked. That's why HDI is bit-identical.
mode = argmax of an estimated density → density estimation is inherently regularized; there is no assumption-free, tuning-free density estimate.
What we ship is the degenerate corner: a nonparametric mode with a fixed (non-adaptive) bandwidth (100 bins) plus an arbitrary tie-break. It is neither parametric nor consistently nonparametric, so it is both biased and non-convergent. A fixed bin count cannot converge no matter how many samples are added, so more samples are necessary but not sufficient. (The estimator is already semi-parametric: the zero_mass_threshold rule is a zero-inflation model; the under-determined part is the continuous-body mode.)
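One way to see the resolution floor is to look at the bin width itself. The sketch below assumes the histogram spans the sample's own [min, max] range (an assumption; the source only states the 100-bin count) and uses hypothetical lognormal draws as a stand-in for a heavy-tailed posterior. Because each larger sample here is a superset of the smaller one, its range can only grow, so the width of a fixed 100-bin grid never shrinks as n increases.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical heavy-tailed draws; not real posterior data.
draws = rng.lognormal(mean=0.0, sigma=1.5, size=1024)

def bin_width(x, bins=100):
    # Assumption: bins are spread evenly over [min(x), max(x)].
    return (x.max() - x.min()) / bins

# Nested prefixes of the same draws, at the sample sizes named in this issue.
widths = {n: bin_width(draws[:n]) for n in (32, 128, 1024)}
for n, w in widths.items():
    print(f"n={n:5d}  bin width={w:.4f}")
```

The reported mode is a bin centre, so its resolution is bounded by this width regardless of how many draws are added.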
Production data-flow — the operating context (the FAO / rusty_bucket path)
The domain is conflict forecasting: zero-inflated, right-skewed, heavy-tailed posteriors — the platform's data-generating process, not one user's slice. The FAO-bound ensemble (in views-models, working name rusty_bucket, still in tuning) is being designed to pool all constituent models' posterior draws (≥128/model × ~8 models ≈ 1024 draws/cell) and ship the full pooled sample arrays downstream — not a pre-collapsed _best scalar (as some existing ensembles do). So views_frames_summarize receives the genuine pooled mixture and performs the single principled MAP/HDI collapse once, at the point of consumption. Consequences for this issue:
The estimator's input is a real 8-component mixture ⇒ multimodality is first-class, not an edge case — the dominant-mode-or-declare-non-unique contract (below) is the production path.
The operating regime is n ≈ 128–1024; 32 is only the autoresearch speed knob, never the design point.
Target / acceptance property (what "correct" means here)
HDIs are nested superlevel sets: HDI_α = {x : f(x) ≥ c_α}, c_α decreasing in α, so HDI_α ⊆ HDI_β for α < β; the point in the intersection of all of them is the mode. So for a single-dominant-mode sample set, define the MAP as the shrinking-HDI limit, MAP := lim(α→0) HDI_α. Then MAP ∈ HDI_α for all α holds by construction — assumption-light, no parametric family, reusing the HDI machinery.
Multimodal / mixed inputs make a scalar MAP ill-posed: the narrow HDI degenerates into disjoint points, and a zero-atom forces MAP → 0. The estimator must then make an explicit choice: pick the dominant (most-mass) mode and say so, or report that the mode is non-unique. It must never silently return a number.
Keep two notions distinct: the definition is the α→0 limit (it pins the estimator); the guard/test checks MAP ∈ HDI at moderate α (50–95%), since the literal α→0 HDI is empirically degenerate.
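The target property above can be written compactly. The notation follows this section; the normalization that HDI_α carries probability mass α and the restriction to a density f with a unique maximizer are assumptions made here to keep the statement well-defined.

```latex
\begin{aligned}
\mathrm{HDI}_\alpha &= \{\, x : f(x) \ge c_\alpha \,\}, \qquad \int_{\mathrm{HDI}_\alpha} f(x)\,dx = \alpha, \\
\alpha < \beta &\;\Longrightarrow\; c_\alpha \ge c_\beta \;\Longrightarrow\; \mathrm{HDI}_\alpha \subseteq \mathrm{HDI}_\beta, \\
\mathrm{MAP} &:= \lim_{\alpha \to 0} \mathrm{HDI}_\alpha = \arg\max_x f(x), \qquad \mathrm{MAP} \in \bigcap_{\alpha > 0} \mathrm{HDI}_\alpha .
\end{aligned}
```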
The HDI tower is the same coherence (C-33 ⇔ C-32)
Nesting and "MAP ∈ HDI" are one statement about density level sets. {f ≥ c} only grows as c falls ⇒ the family is nested by construction, and the mode is the top set. The consumer's analyzer imitates this post-hoc — expand each wider interval to contain the narrower, then shift the narrowest to cover the MAP — which makes the intervals no longer the true shortest and couples the HDI to the (biased) MAP. The right move is to derive the whole family from one coherent object (a smoothed density's level sets, or the shrinking shortest-interval), so the tower and the mode fall out mutually consistent.
API implication: a multi-mass hdi(frame, masses=[…]) returning a guaranteed-nested (N, len(masses), 2) tower (and, if wanted, the MAP as its α→0 limit).
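A nested tower can be sketched for a single cell by computing the widest interval first and searching each narrower interval only inside the previous one. This is one possible reading of "shrinking shortest-interval", not the package's API: the function names are hypothetical, it handles one row of samples and returns shape (len(masses), 2) rather than the (N, len(masses), 2) tower, and each inner interval is the shortest within its parent rather than the global shortest.

```python
import numpy as np

def shortest_interval(sorted_x, k):
    """Index range [i, i + k) of the shortest window of k consecutive order statistics."""
    widths = sorted_x[k - 1:] - sorted_x[:len(sorted_x) - k + 1]
    i = int(np.argmin(widths))
    return i, i + k

def nested_hdi_tower(samples, masses):
    """Hypothetical sketch: shortest intervals, widest first, each inside the last."""
    x = np.sort(np.asarray(samples, dtype=float))
    n = x.size
    out = np.empty((len(masses), 2))
    lo, hi = 0, n                                  # current search window (indices into x)
    for j in np.argsort(masses)[::-1]:             # process the widest mass first
        # k counts samples relative to the FULL sample, capped by the current window.
        k = max(1, min(hi - lo, int(np.ceil(masses[j] * n))))
        i0, i1 = shortest_interval(x[lo:hi], k)
        lo, hi = lo + i0, lo + i1                  # shrink the window for the next mass
        out[j] = (x[lo], x[hi - 1])
    return out

tower = nested_hdi_tower([0, 0, 0, 1, 2, 3, 5, 8, 13, 21], masses=[0.5, 0.9])
print(tower)
```

Rows are returned in the order of the masses argument. Nesting holds by construction because every narrower interval is drawn from the samples of the wider one.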
Candidate fixes
Leading candidate — shrinking-HDI / half-sample mode (reuses code we already trust).
The constructive realization of the target property: at small-but-finite mass the shrinking shortest-interval is a recognized data-adaptive mode estimator, namely the shorth and the half-sample / Robertson–Cryer mode (recurse on the shortest interval holding a fraction of the data). Advantages: no fixed-bin bandwidth (width adapts to local density), it reuses the bit-identical HDI machinery, and run at every α it yields the nested tower for C-33 too. Catch: as α→0, k = ⌊α·n⌋ shrinks to 1–2 samples (noisy), so stop at a small-but-finite mass and recurse.
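The half-sample recursion is short enough to sketch in plain Python. This is an illustrative, unvectorized version for one cell; the function name is hypothetical, and ties between equally short windows are again resolved toward the leftmost window by min, so heavily tied inputs (for example a zero-atom) still need the explicit handling discussed under the target property.

```python
import math

def half_sample_mode(samples):
    """Hypothetical sketch of the half-sample mode for a single 1-D sample."""
    x = sorted(float(v) for v in samples)
    if not x:
        raise ValueError("need at least one sample")
    while len(x) > 2:
        k = math.ceil(len(x) / 2)  # keep (at least) half of the points
        # Start index of the shortest window holding k consecutive order statistics.
        start = min(range(len(x) - k + 1), key=lambda i: x[i + k - 1] - x[i])
        x = x[start:start + k]     # recurse on the densest half
    return sum(x) / len(x)         # 1 or 2 points left: return their mean

# Toy input: a tight cluster near 5 plus scattered outliers.
print(half_sample_mode([0.1, 5.0, 5.1, 5.2, 9.0, 20.0]))
```

The window width adapts to the local density at each step, so no bin count or bandwidth has to be chosen.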
Other routes
(A) Distributional assumption — fit a family → analytic mode (+ analytic nested HDIs). Stable at low n; cost = model misspecification.
(B) n-adaptive smoothing + sufficient-n floor — bins/bandwidth shrinking with n (e.g. h ∝ n^(−1/5)); level sets give nesting.
A better tie-break or a post-hoc expand-to-nest only patches symptoms — band-aids, not cures.
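Route (B) can be illustrated with a bandwidth that shrinks as n^(−1/5) and refuses to answer below a sample-size floor. The constant 1.06 is the normal-reference rule of thumb and is used here only as a placeholder, since conflict posteriors are far from Gaussian; the floor n_min=128 and the function name are hypothetical.

```python
import statistics

def adaptive_bandwidth(samples, n_min=128, c=1.06):
    """Hypothetical sketch: h = c * stdev * n^(-1/5), or None below the n floor."""
    n = len(samples)
    if n < n_min:
        return None  # too few draws: decline rather than return an unstable value
    return c * statistics.stdev(samples) * n ** (-1 / 5)

# Same two-point toy shape at three sample sizes.
for reps in (16, 64, 512):
    data = [0.0, 1.0] * reps
    print(len(data), adaptive_bandwidth(data))
```

Unlike a fixed bin count, the smoothing scale here tightens as draws are added.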
Cheap diagnostic / possible interim guard (independent of the full fix)
For a unimodal density the mode must lie inside every HDI: MAP ∈ HDI_α. A MAP outside a moderate (50–95%) HDI is a self-evident sign the estimator isn't tracking density — a cheap, shippable guard/regression test (warn, don't hard-raise; scope to unimodal). Likewise a cheap nesting check across a mass-family flags C-33 violations.
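Both checks fit in a few lines. The sketch below is a possible shape for the guard, with hypothetical function names and a plain (mass, lo, hi) tuple layout chosen for illustration; it warns rather than raises, as suggested above, and leaves the unimodality scoping to the caller.

```python
import warnings

def check_map_in_hdi(map_value, hdi_lo, hdi_hi, mass):
    """Return True if the MAP lies inside the HDI; warn (do not raise) otherwise."""
    inside = hdi_lo <= map_value <= hdi_hi
    if not inside:
        warnings.warn(
            f"MAP {map_value} lies outside the {mass:.0%} HDI [{hdi_lo}, {hdi_hi}]"
        )
    return inside

def check_nested(tower):
    """tower: iterable of (mass, lo, hi). True if each interval sits inside the next wider one."""
    ordered = sorted(tower)  # ascending mass
    return all(
        wide_lo <= narrow_lo and narrow_hi <= wide_hi
        for (_, narrow_lo, narrow_hi), (_, wide_lo, wide_hi) in zip(ordered, ordered[1:])
    )

print(check_nested([(0.5, 1.0, 2.0), (0.9, 0.0, 3.0)]))  # nested
print(check_nested([(0.5, 1.0, 4.0), (0.9, 0.0, 3.0)]))  # narrower band pokes out
```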
How "better" gets defined (so autoresearch is safe to point at this)
Because there is no ground truth on real posteriors, "better" must be an operational definition on a pre-registered synthetic benchmark, centred on the conflict shape (zero-inflated, right-skewed, heavy-tailed, mixtures), swept over n ∈ {32(fast), 128, 1024} and parameters, with a held-out split for generalization. FAO's real data is an end validation check only, never a training signal. The composite metric is feasibility-first (lexicographic): hard constraints (MAP ∈ HDI; consistency — error shrinks as n grows; vectorized within the memory budget) gate, then minimize bias + variance + stability-drift among feasible estimators — so the loop cannot trade away coherence/convergence/cost. Run the incumbent (PosteriorDistributionAnalyzer mode) through the same benchmark to get a baseline number: "better" = beats it; if nothing does, the incumbent is accepted with evidence.
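The feasibility-first ordering can be made concrete as a gate followed by a score. This is a sketch under stated assumptions: the record layout and function names are hypothetical, the soft score is an unweighted sum because the source does not fix weights, and the numbers in the usage lines are invented purely to exercise the logic.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class BenchResult:
    name: str
    map_in_hdi: bool      # hard constraint: coherence
    consistent: bool      # hard constraint: error shrinks as n grows
    within_memory: bool   # hard constraint: vectorized within the memory budget
    bias: float
    variance: float
    drift: float

def is_feasible(r):
    return r.map_in_hdi and r.consistent and r.within_memory

def soft_score(r):
    # Assumption: equal weights; lower is better.
    return r.bias + r.variance + r.drift

def select(incumbent, challengers):
    """Hard constraints gate first; only feasible challengers compete on score."""
    feasible = [c for c in challengers if is_feasible(c)]
    if not feasible:
        return incumbent
    best = min(feasible, key=soft_score)
    return best if soft_score(best) < soft_score(incumbent) else incumbent

# Made-up numbers, for illustration only.
incumbent = BenchResult("incumbent", True, True, True, 0.30, 0.10, 0.05)
incoherent = BenchResult("low_error_but_incoherent", False, True, True, 0.01, 0.01, 0.01)
candidate = BenchResult("half_sample", True, True, True, 0.10, 0.12, 0.03)
print(select(incumbent, [incoherent, candidate]).name)
```

The infeasible challenger has the lowest error but never reaches the scoring stage, which is the sense in which the loop cannot trade away coherence, convergence or cost.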
Constraints / decision (needs-decision)
SemVer: changing map_estimate's output and/or adding a nested-HDI API is a behavior/surface change. Per GOVERNANCE/ADR-018 (frozen v1.0): MINOR-with-deprecation vs MAJOR vs opt-in (method= / new masses= API)?
Scope guard (ADR-017): stays a sample-axis summarizer — no IO/domain/scoring.
Acceptance (when scheduled)
An estimator satisfying the target property (MAP ∈ HDI by construction for single-dominant-mode; explicit dominant-mode-or-declare-non-unique otherwise), with a stability and bias rationale — not a tie-break swap.
A nested-by-construction HDI family + a multi-mass hdi(frame, masses=[…]) API (closes C-33), with a nesting law in the conformance/property tests.
The MAP ∈ HDI guard; the pre-registered synthetic benchmark + feasibility-first metric as a versioned artifact; parity/scale tests; CHANGELOG + register C-32/C-33 resolution.
References
Evidence: a views-faoapi integration spike (parity test + audit plots), 2026-06-23 — held in the consumer repo.
Further consequence of the pooled-sample data-flow: views-models bakes _best for some paths, views-faoapi re-derives a histogram-mode for others, and multimodality is handled in none of them. Pooled samples plus one principled summarizer centralizes it. (Producer side tracked in the views-models rusty_bucket issue.)
src/views_frames_summarize/point.py:59-104, interval.py:23-48; register C-24 (portability fix), C-32, C-33. Producer side: views-models rusty_bucket ensemble (pooled-sample delivery).