Estimator coherence: robust MAP for zero-inflated / right-skewed / low-sample posteriors (C-32; C-33 resolved by ADR-019) #89

Description

@Polichinel

Separate effort (not scheduled here) tracking the fix for register C-32 (MAP bias) and C-33 (HDI has no nesting/tower guarantee) — one shared problem. Surfaced by a views-faoapi integration spike (2026-06-23).

Symptom

  • C-32 (MAP). map_estimate breaks histogram-peak ties on the lowest bin index (point.py:100, np.argmax(counts)) — the C-24 portability fix. On real conflict posteriors (right-skewed, zero-inflated, heavy-tailed) the peak is often a multi-way tie, and lowest-index = leftmost = smallest value, so the mode is pulled toward zero (measured at the 32-draw speed setting: ~21% of active cells, one-directional, up to 7.9 ln; the magnitude shrinks at production sample counts but the bias direction and the non-convergence persist).
  • C-33 (HDI tower). hdi(frame, mass) returns one mass's shortest interval per call and stops — no nesting guarantee across masses, no multi-mass API. On skewed/multimodal empirical samples a narrower band can poke outside a wider one.

(The raw shortest-interval HDI is otherwise bit-identical to the consumer's — the divergence is purely the post-hoc enforcement the consumer adds and we don't.)
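Both symptoms reproduce on toy data. A minimal sketch, assuming nothing about the real point.py / interval.py beyond what is stated above: shortest_interval is a hypothetical helper, and its window rule k = round(mass·n) is an assumption, not necessarily the rule hdi uses.

```python
import numpy as np

# C-32: np.argmax returns the first occurrence of the maximum, so a
# multi-way tie at the histogram peak always resolves to the leftmost bin.
counts = np.array([0, 3, 1, 3, 3])
tied_bins = np.flatnonzero(counts == counts.max())  # bins 1, 3 and 4 tie
picked_bin = int(np.argmax(counts))                 # lowest tied index wins


def shortest_interval(samples, mass):
    """Shortest window holding round(mass * n) sorted samples (hypothetical helper)."""
    x = np.sort(np.asarray(samples, dtype=float))
    k = max(1, int(round(mass * x.size)))
    widths = x[k - 1:] - x[: x.size - k + 1]
    start = int(np.argmin(widths))
    return float(x[start]), float(x[start + k - 1])


# C-33: a bimodal toy sample. The tightest 3 points sit in the right cluster,
# the tightest 5 points in the left one, so the 30% band is outside the 50% band.
samples = [0.0, 0.1, 0.2, 0.3, 0.4, 5.0, 6.0, 6.05, 6.1, 10.0]
narrow = shortest_interval(samples, 0.3)
wide = shortest_interval(samples, 0.5)
nested = wide[0] <= narrow[0] and narrow[1] <= wide[1]
```

On this toy sample the 30% interval lands in the right-hand cluster and the 50% interval in the left-hand one, which is the "narrower band pokes outside a wider one" case in its most extreme form.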

Why this is hard (not just a tie-break)

The mode is the only one of our estimates that is a functional of the density, not the CDF:

  • mean = an average; quantiles/HDI = order statistics → need only the samples ranked. That's why HDI is bit-identical.
  • mode = argmax of an estimated density → density estimation is inherently regularized; there is no assumption-free, tuning-free density estimate.

What we ship is the degenerate corner: a nonparametric mode with a fixed (non-adaptive) bandwidth (100 bins) plus an arbitrary tie-break. It is neither parametric nor consistently nonparametric, so it is both biased and non-convergent: a fixed bin count cannot converge no matter how many samples are added. More samples are therefore necessary but not sufficient. (The estimator is already semi-parametric: the zero_mass_threshold rule is a zero-inflation model; the under-determined part is the continuous-body mode.)

Production data-flow — the operating context (the FAO / rusty_bucket path)

The domain is conflict forecasting: zero-inflated, right-skewed, heavy-tailed posteriors. That is the platform's data-generating process, not one user's slice. The FAO-bound ensemble (in views-models, working name rusty_bucket, still in tuning) is being designed to pool all constituent models' posterior draws (≥128/model × ~8 models ≈ 1024 draws/cell) and ship the full pooled sample arrays downstream, not a pre-collapsed _best scalar (as some existing ensembles do). So views_frames_summarize receives the genuine pooled mixture and performs the single principled MAP/HDI collapse once, at the point of consumption. Consequences for this issue:

  • The estimator's input is a real 8-component mixture: multimodality is first-class, not an edge case, and the dominant-mode-or-declare-non-unique contract (below) is the production path.
  • The operating regime is n ≈ 128–1024; 32 is only the autoresearch speed knob, never the design point.
  • This is what makes this issue (#89) load-bearing. Today the collapse is fragmented and opaque: views-models bakes _best for some paths, views-faoapi re-derives a histogram-mode for others, and multimodality is handled in none of them. Pooled samples plus one principled summarizer centralize it. (Producer side tracked in the views-models rusty_bucket issue.)
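The pooled-sample delivery described above amounts to a concatenation along the sample axis. A sketch under stated assumptions: the array layout (cells × draws), the lognormal stand-in draws, and the per-model median used as a stand-in for a pre-collapsed scalar are all illustrative, not the rusty_bucket implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
n_cells, n_models, draws_per_model = 5, 8, 128

# Hypothetical per-model posterior draws, each shaped (n_cells, draws_per_model).
per_model = [
    rng.lognormal(mean=0.3 * m, sigma=1.0, size=(n_cells, draws_per_model))
    for m in range(n_models)
]

# Pool along the sample axis: every cell keeps all 8 x 128 = 1024 draws,
# so the summarizer sees the full mixture and collapses it exactly once.
pooled = np.concatenate(per_model, axis=1)

# Contrast: collapsing each model first (median here, as a stand-in)
# leaves only 8 scalars per cell and discards the mixture shape.
pre_collapsed = np.stack([np.median(d, axis=1) for d in per_model], axis=1)
```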

Target / acceptance property (what "correct" means here)

HDIs are nested superlevel sets: HDI_α = {x : f(x) ≥ c_α}, c_α decreasing in α, so HDI_α ⊆ HDI_β for α < β; the point in the intersection of all of them is the mode. So for a single-dominant-mode sample set, define the MAP as the shrinking-HDI limit, MAP := lim(α→0) HDI_α. Then MAP ∈ HDI_α for all α holds by construction — assumption-light, no parametric family, reusing the HDI machinery.

  • Multimodal / mixed inputs make a scalar MAP ill-posed — the narrow HDI is disjoint points; a zero-atom forces MAP → 0. The estimator must then make an explicit choice — pick the dominant (most-mass) mode and say so, or report the mode is non-unique — never silently return a number.
  • Keep two notions distinct: the definition is the α→0 limit (it pins the estimator); the guard/test checks MAP ∈ HDI at moderate α (50–95%), since the literal α→0 HDI is empirically degenerate.
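The level-set picture can be checked numerically on a known density. A minimal sketch, assuming a Gamma(shape=2, scale=1) density on a grid as the test case; level_set_mask is a hypothetical helper, and α is read as the enclosed mass, as in the definition above.

```python
import numpy as np


def level_set_mask(density, dx, mass):
    """Grid mask of the superlevel set {f >= c} holding at least `mass`."""
    order = np.argsort(density)[::-1]          # highest density first
    cum = np.cumsum(density[order]) * dx       # mass accumulated as c falls
    n_keep = int(np.searchsorted(cum, mass)) + 1
    mask = np.zeros(density.shape, dtype=bool)
    mask[order[:n_keep]] = True
    return mask


x = np.linspace(0.0, 30.0, 30001)
dx = x[1] - x[0]
density = x * np.exp(-x)                       # Gamma(2, 1); mode at x = 1

# Every mask is a prefix of the same density ordering, so the sets are
# nested by construction and the top of the ordering (the mode) is in all.
masks = {m: level_set_mask(density, dx, m) for m in (0.5, 0.8, 0.95)}
mode_idx = int(np.argmax(density))
```

Because all three masks come from one ordering of the density, nesting and "mode inside every HDI" are the same fact here; no post-hoc enforcement is involved.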

The HDI tower is the same coherence (C-33 ⇔ C-32)

Nesting and "MAP ∈ HDI" are one statement about density level sets. {f ≥ c} only grows as c falls ⇒ the family is nested by construction, and the mode is the top set. The consumer's analyzer imitates this post-hoc — expand each wider interval to contain the narrower, then shift the narrowest to cover the MAP — which makes the intervals no longer the true shortest and couples the HDI to the (biased) MAP. The right move is to derive the whole family from one coherent object (a smoothed density's level sets, or the shrinking shortest-interval), so the tower and the mode fall out mutually consistent.

  • API implication: a multi-mass hdi(frame, masses=[…]) returning a guaranteed-nested (N, len(masses), 2) tower (and, if wanted, the MAP as its α→0 limit).
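One way the multi-mass call could look, as a sketch only: nested_hdi_tower is a hypothetical name, not the package API, and the window rule k = ceil(mass·n) is an assumption. It realizes the "shrinking shortest-interval" option: the widest interval is the ordinary shortest interval, and each narrower one is the shortest window inside the previous one, so nesting holds by construction (at the cost that narrower bands are shortest within their parent, not necessarily globally).

```python
import numpy as np


def _shortest_window(x_sorted, lo, hi, k):
    """Index range [start, start + k) of the shortest k-point window in x_sorted[lo:hi]."""
    starts = np.arange(lo, hi - k + 1)
    widths = x_sorted[starts + k - 1] - x_sorted[starts]
    start = int(starts[np.argmin(widths)])
    return start, start + k


def nested_hdi_tower(frame, masses):
    """Return an (N, len(masses), 2) array of nested intervals, one row per cell."""
    frame = np.asarray(frame, dtype=float)
    masses = np.asarray(masses, dtype=float)
    n_cells, n = frame.shape
    widest_first = np.argsort(masses)[::-1]
    tower = np.empty((n_cells, masses.size, 2))
    for i in range(n_cells):                   # plain loop for clarity, not speed
        x = np.sort(frame[i])
        lo, hi = 0, n
        for j in widest_first:
            k = max(1, int(np.ceil(masses[j] * n)))
            k = min(k, hi - lo)                # never exceed the parent window
            lo, hi = _shortest_window(x, lo, hi, k)
            tower[i, j] = x[lo], x[hi - 1]
    return tower
```

A production version would need to be vectorized over cells to meet the memory/speed constraints; the loop here only shows the nesting logic.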

Candidate fixes

Leading candidate — shrinking-HDI / half-sample mode (reuses code we already trust).
The constructive realization of the target property: at small-but-finite mass the shrinking shortest-interval is a recognized data-adaptive mode estimator, namely the shorth and the half-sample / Robertson–Cryer mode (recurse on the shortest interval holding a fraction of the data). Advantages: no fixed-bin bandwidth (width adapts to local density); it reuses the bit-identical HDI machinery; and, run at every α, it yields the nested tower for C-33 too. Catch: as α→0 the window size k = ⌊α·n⌋ shrinks to 1–2 samples (noisy), so stop at a small-but-finite mass and recurse.
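A minimal sketch of the half-sample recursion for a single 1-D sample set. half_sample_mode is a hypothetical name; the endpoint rules for 1 to 3 remaining points follow the usual textbook presentation of the estimator and are an assumption about what we would adopt, not existing package code.

```python
import numpy as np


def half_sample_mode(samples):
    """Half-sample mode: repeatedly keep the shortest window holding half the points."""
    x = np.sort(np.asarray(samples, dtype=float))
    while x.size > 3:
        k = int(np.ceil(x.size / 2))
        widths = x[k - 1:] - x[: x.size - k + 1]
        start = int(np.argmin(widths))         # ties still resolve leftmost
        x = x[start:start + k]
    if x.size == 3:
        left_gap, right_gap = x[1] - x[0], x[2] - x[1]
        if left_gap < right_gap:
            return float(x[:2].mean())
        if right_gap < left_gap:
            return float(x[1:].mean())
        return float(x[1])
    return float(x.mean())                     # one or two points left
```

The window width adapts to local density at every step, so there is no bin count to tune. Note that np.argmin still breaks exact width ties on the leftmost window, so a tie policy remains a (much smaller) design decision.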

Other routes

  • (A) Distributional assumption — fit a family → analytic mode (+ analytic nested HDIs). Stable at low n; cost = model misspecification.
  • (B) n-adaptive smoothing + sufficient-n floor — bins/bandwidth shrinking with n (e.g. h ∝ n^(−1/5)); level sets give nesting.
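For comparison, routes (A) and (B) in their simplest forms. Both are sketches under stated assumptions: (A) assumes a lognormal family for the strictly positive continuous body (zeros must be handled separately), and (B) assumes a Gaussian kernel with the normal-reference bandwidth h = 1.06·s·n^(−1/5). The function names are hypothetical.

```python
import numpy as np


def lognormal_mode(samples):
    """Route (A): fit a lognormal by ML on log(x); its mode is exp(mu - sigma^2)."""
    logs = np.log(np.asarray(samples, dtype=float))  # requires samples > 0
    mu, sigma = logs.mean(), logs.std()
    return float(np.exp(mu - sigma**2))


def kde_mode(samples, grid_size=512):
    """Route (B): argmax of a Gaussian KDE whose bandwidth shrinks as n^(-1/5)."""
    x = np.asarray(samples, dtype=float)
    h = 1.06 * x.std(ddof=1) * x.size ** (-1 / 5)    # assumes non-constant samples
    grid = np.linspace(x.min(), x.max(), grid_size)
    z = (grid[:, None] - x[None, :]) / h             # (grid_size, n) working array
    density = np.exp(-0.5 * z**2).sum(axis=1)        # unnormalized; argmax unaffected
    return float(grid[np.argmax(density)])
```

Route (A) is stable at low n but inherits any misspecification of the family; route (B) builds a grid_size × n array per cell, which matters for the vectorization and memory constraint.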

A better tie-break or a post-hoc expand-to-nest only patches symptoms — band-aids, not cures.

Cheap diagnostic / possible interim guard (independent of the full fix)

For a unimodal density the mode must lie inside every HDI: MAP ∈ HDI_α. A MAP outside a moderate (50–95%) HDI is a self-evident sign the estimator isn't tracking density — a cheap, shippable guard/regression test (warn, don't hard-raise; scope to unimodal). Likewise a cheap nesting check across a mass-family flags C-33 violations.
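A sketch of both guards as warn-only checks. The function names and array layouts are assumptions: hdi is taken as an (N, 2) array of bounds for one moderate mass, and the tower as an (N, M, 2) array ordered by increasing mass.

```python
import warnings

import numpy as np


def check_map_in_hdi(map_est, hdi):
    """Per-cell mask of MAP inside [lower, upper]; warns instead of raising."""
    map_est = np.asarray(map_est, dtype=float)
    hdi = np.asarray(hdi, dtype=float)
    inside = (hdi[:, 0] <= map_est) & (map_est <= hdi[:, 1])
    if not inside.all():
        warnings.warn(
            f"{int((~inside).sum())} of {inside.size} cells have MAP outside the HDI",
            RuntimeWarning,
        )
    return inside


def check_nested(tower):
    """Per-cell mask: lower bounds non-increasing and upper bounds non-decreasing in mass."""
    tower = np.asarray(tower, dtype=float)
    lower_ok = np.all(np.diff(tower[:, :, 0], axis=1) <= 0, axis=1)
    upper_ok = np.all(np.diff(tower[:, :, 1], axis=1) >= 0, axis=1)
    nested = lower_ok & upper_ok
    if not nested.all():
        warnings.warn(
            f"{int((~nested).sum())} of {nested.size} cells violate HDI nesting",
            RuntimeWarning,
        )
    return nested
```

Returning the mask lets a caller scope the check to cells already judged unimodal, as the guard requires.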

How "better" gets defined (so autoresearch is safe to point at this)

Because there is no ground truth on real posteriors, "better" must be an operational definition on a pre-registered synthetic benchmark, centred on the conflict shape (zero-inflated, right-skewed, heavy-tailed, mixtures), swept over n ∈ {32(fast), 128, 1024} and parameters, with a held-out split for generalization. FAO's real data is an end validation check only, never a training signal. The composite metric is feasibility-first (lexicographic): hard constraints (MAP ∈ HDI; consistency — error shrinks as n grows; vectorized within the memory budget) gate, then minimize bias + variance + stability-drift among feasible estimators — so the loop cannot trade away coherence/convergence/cost. Run the incumbent (PosteriorDistributionAnalyzer mode) through the same benchmark to get a baseline number: "better" = beats it; if nothing does, the incumbent is accepted with evidence.
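The feasibility-first ordering can be written down as a lexicographic sort key. A sketch with assumed names throughout: the BenchmarkResult fields, the unweighted sum used as the loss, and the rule for an infeasible incumbent are illustrative choices that the pre-registration would have to fix.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BenchmarkResult:
    name: str
    map_in_hdi: bool       # hard constraint: MAP inside HDI
    consistent: bool       # hard constraint: error shrinks as n grows
    within_memory: bool    # hard constraint: vectorized within the budget
    bias: float
    variance: float
    drift: float           # stability drift

    @property
    def feasible(self) -> bool:
        return self.map_in_hdi and self.consistent and self.within_memory

    @property
    def loss(self) -> float:
        # Assumption: unweighted sum; the real weighting is a benchmark decision.
        return self.bias + self.variance + self.drift


def rank(results):
    """Feasible estimators first, then ascending loss within each group."""
    return sorted(results, key=lambda r: (not r.feasible, r.loss))


def beats_incumbent(candidate, incumbent):
    """A candidate wins only if feasible and better than a feasible incumbent."""
    if not candidate.feasible:
        return False
    return (not incumbent.feasible) or candidate.loss < incumbent.loss
```

Because feasibility is the first component of the key, no amount of bias or variance improvement can lift an infeasible estimator above a feasible one.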

Constraints / decision (needs-decision)

  • SemVer: changing map_estimate's output and/or adding a nested-HDI API is a behavior/surface change. Per GOVERNANCE/ADR-018 (frozen v1.0): MINOR-with-deprecation vs MAJOR vs opt-in (method= / new masses= API)?
  • Scope guard (ADR-017): stays a sample-axis summarizer — no IO/domain/scoring.

Acceptance (when scheduled)

  • An estimator satisfying the target property (MAP ∈ HDI by construction for single-dominant-mode; explicit dominant-mode-or-declare-non-unique otherwise), with a stability and bias rationale — not a tie-break swap.
  • A nested-by-construction HDI family + a multi-mass hdi(frame, masses=[…]) API (closes C-33), with a nesting law in the conformance/property tests.
  • The MAP ∈ HDI guard; the pre-registered synthetic benchmark + feasibility-first metric as a versioned artifact; parity/scale tests; CHANGELOG + register C-32/C-33 resolution.

References

  • Evidence: a views-faoapi integration spike (parity test + audit plots), 2026-06-23 — held in the consumer repo.
  • src/views_frames_summarize/point.py:59-104, interval.py:23-48; register C-24 (portability fix), C-32, C-33. Producer side: views-models rusty_bucket ensemble (pooled-sample delivery).

    Labels

    enhancement (New feature or request), risk (Technical risk register), summarize (views_frames_summarize package)
