Estimator coherence: robust MAP for zero-inflated / right-skewed / low-sample posteriors (C-32; C-33 resolved by ADR-019)

**Separate effort** (not scheduled here) tracking the fix for register **C-32** (MAP bias) and **C-33** (HDI has no nesting/tower guarantee) — one shared problem. Surfaced by a views-faoapi integration spike (2026-06-23).

## Symptom
- **C-32 (MAP).** `map_estimate` breaks histogram-peak ties on the **lowest bin index** (`point.py:100`, `np.argmax(counts)`) — the C-24 portability fix. On real conflict posteriors (right-skewed, zero-inflated, heavy-tailed) the peak is often a multi-way tie, and lowest-index = **leftmost = smallest value**, so the mode is **pulled toward zero** (measured at the 32-draw speed setting: ~21% of active cells, one-directional, up to 7.9 ln; the magnitude shrinks at production sample counts but the bias *direction* and the non-convergence persist).
- **C-33 (HDI tower).** `hdi(frame, mass)` returns one mass's shortest interval per call and stops — no nesting guarantee across masses, no multi-mass API. On skewed/multimodal empirical samples a narrower band can poke **outside** a wider one.

(The *raw* shortest-interval HDI is otherwise bit-identical to the consumer's — the divergence is purely the **post-hoc enforcement** the consumer adds and we don't.)

## Why this is hard (not just a tie-break)
The mode is the only one of our estimates that is a functional of the **density**, not the **CDF**:
- mean = an average; quantiles/HDI = order statistics → need only the samples *ranked*. That's **why HDI is bit-identical**.
- mode = argmax of an *estimated density* → density estimation is inherently **regularized**; there is no assumption-free, tuning-free density estimate.

What we ship is the degenerate corner: a **nonparametric mode with a fixed (non-adaptive) bandwidth (100 bins) + an arbitrary tie-break** — neither parametric nor consistently nonparametric, so it is both **biased *and* non-convergent**. A fixed bin count **cannot** converge no matter how many samples are added. So **more samples alone is necessary, not sufficient.** (The estimator is already semi-parametric: the `zero_mass_threshold` rule is a zero-inflation model; the under-determined part is the **continuous-body mode**.)

## Production data-flow — the operating context (the FAO / `rusty_bucket` path)
The domain is **conflict forecasting**: zero-inflated, right-skewed, heavy-tailed posteriors — *the platform's data-generating process, not one user's slice.* The FAO-bound ensemble (in `views-models`, working name **`rusty_bucket`**, still in tuning) is being designed to **pool all constituent models' posterior draws** (≥128/model × ~8 models ≈ **1024 draws/cell**) and **ship the full pooled sample arrays downstream** — **not** a pre-collapsed `_best` scalar (as some existing ensembles do). So `views_frames_summarize` receives the **genuine pooled mixture** and performs the **single principled MAP/HDI collapse once, at the point of consumption.** Consequences for this issue:
- The estimator's input is a **real 8-component mixture** ⇒ **multimodality is first-class, not an edge case** — the dominant-mode-or-declare-non-unique contract (below) *is* the production path.
- The operating regime is **n ≈ 128–1024**; **32 is only the autoresearch speed knob**, never the design point.
- This is what makes #89 load-bearing: today the collapse is **fragmented and opaque** — `views-models` bakes `_best` for some paths, `views-faoapi` re-derives a histogram-mode for others, and **multimodality is handled in none of them**. Pooled-samples + one principled summarizer centralizes it. (Producer side tracked in the `views-models` `rusty_bucket` issue.)

## Target / acceptance property (what "correct" means here)
HDIs are **nested superlevel sets**: `HDI_α = {x : f(x) ≥ c_α}`, `c_α` decreasing in α, so `HDI_α ⊆ HDI_β` for α < β; the point in the intersection of all of them is the mode. So for a **single-dominant-mode** sample set, define the MAP **as the shrinking-HDI limit**, `MAP := lim(α→0) HDI_α`. Then `MAP ∈ HDI_α` for **all** α holds **by construction** — assumption-light, no parametric family, reusing the HDI machinery.
- **Multimodal / mixed inputs make a scalar MAP ill-posed** — the narrow HDI is disjoint points; a zero-atom forces `MAP → 0`. The estimator must then make an **explicit** choice — pick the **dominant** (most-mass) mode and say so, or **report the mode is non-unique** — never silently return a number.
- Keep two notions distinct: the **definition** is the α→0 limit (it *pins* the estimator); the **guard/test** checks `MAP ∈ HDI` at *moderate* α (50–95%), since the literal α→0 HDI is empirically degenerate.

## The HDI tower is the *same* coherence (C-33 ⇔ C-32)
Nesting and "MAP ∈ HDI" are one statement about **density level sets**. `{f ≥ c}` only grows as `c` falls ⇒ the family is **nested by construction**, and the mode is the top set. The consumer's analyzer *imitates* this post-hoc — expand each wider interval to contain the narrower, then **shift the narrowest to cover the MAP** — which makes the intervals no longer the true shortest and **couples the HDI to the (biased) MAP.** The right move is to **derive the whole family from one coherent object** (a smoothed density's level sets, or the **shrinking shortest-interval**), so the tower **and** the mode fall out mutually consistent.
- **API implication:** a multi-mass `hdi(frame, masses=[…])` returning a **guaranteed-nested** `(N, len(masses), 2)` tower (and, if wanted, the MAP as its α→0 limit).

## Candidate fixes

**Leading candidate — shrinking-HDI / half-sample mode (reuses code we already trust).**
The constructive realization of the target property: at small-but-finite mass the shrinking shortest-interval is a recognized **data-adaptive** mode estimator — the **shorth** and the **half-sample / Robertson–Cryer mode** (recurse on the shortest interval holding a fraction of the data). Advantages: **no fixed-bin bandwidth** (width adapts to local density), **reuses the bit-identical HDI machinery**, and run at every α it yields the **nested tower** for C-33 too. Catch: as `α→0`, `k = ⌊α·n⌋ → 0` → 1–2 samples (noisy); stop at small-but-finite and recurse.

**Other routes**
- **(A) Distributional assumption** — fit a family → analytic mode (+ analytic nested HDIs). Stable at low `n`; cost = model misspecification.
- **(B) `n`-adaptive smoothing + sufficient-`n` floor** — bins/bandwidth shrinking with `n` (e.g. `h ∝ n^(−1/5)`); level sets give nesting.

A better *tie-break* or a post-hoc *expand-to-nest* only patches symptoms — band-aids, not cures.

## Cheap diagnostic / possible interim guard (independent of the full fix)
For a **unimodal** density the mode must lie **inside** every HDI: `MAP ∈ HDI_α`. A MAP outside a moderate (50–95%) HDI is a *self-evident* sign the estimator isn't tracking density — a cheap, shippable **guard/regression test** (warn, don't hard-raise; scope to unimodal). Likewise a cheap **nesting check** across a mass-family flags C-33 violations.

## How "better" gets defined (so autoresearch is safe to point at this)
Because there is no ground truth on real posteriors, "better" must be an *operational* definition on a **pre-registered synthetic benchmark**, centred on the conflict shape (zero-inflated, right-skewed, heavy-tailed, **mixtures**), swept over `n ∈ {32(fast), 128, 1024}` and parameters, with a **held-out** split for generalization. FAO's real data is an **end validation** check only, never a training signal. The composite metric is **feasibility-first (lexicographic)**: hard constraints (`MAP ∈ HDI`; **consistency** — error shrinks as `n` grows; vectorized within the memory budget) gate, then minimize **bias + variance + stability-drift** among feasible estimators — so the loop *cannot* trade away coherence/convergence/cost. Run the incumbent (`PosteriorDistributionAnalyzer` mode) through the same benchmark to get a baseline number: "better" = beats it; if nothing does, the incumbent is accepted **with evidence**.

## Constraints / decision (needs-decision)
- **SemVer**: changing `map_estimate`'s output and/or adding a nested-HDI API is a behavior/surface change. Per GOVERNANCE/ADR-018 (frozen v1.0): MINOR-with-deprecation vs MAJOR vs opt-in (`method=` / new `masses=` API)?
- Scope guard (ADR-017): stays a sample-axis summarizer — no IO/domain/scoring.

## Acceptance (when scheduled)
- An estimator satisfying the **target property** (MAP ∈ HDI by construction for single-dominant-mode; explicit dominant-mode-or-declare-non-unique otherwise), with a **stability *and* bias** rationale — not a tie-break swap.
- A **nested-by-construction** HDI family + a multi-mass `hdi(frame, masses=[…])` API (closes C-33), with a nesting law in the conformance/property tests.
- The `MAP ∈ HDI` guard; the pre-registered synthetic benchmark + feasibility-first metric as a versioned artifact; parity/scale tests; CHANGELOG + register C-32/C-33 resolution.

## References
- Evidence: a views-faoapi integration spike (parity test + audit plots), 2026-06-23 — held in the consumer repo.
- `src/views_frames_summarize/point.py:59-104`, `interval.py:23-48`; register C-24 (portability fix), C-32, C-33. Producer side: `views-models` `rusty_bucket` ensemble (pooled-sample delivery).


Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Estimator coherence: robust MAP for zero-inflated / right-skewed / low-sample posteriors (C-32; C-33 resolved by ADR-019) #89

Symptom

Why this is hard (not just a tie-break)

Production data-flow — the operating context (the FAO / `rusty_bucket` path)

Target / acceptance property (what "correct" means here)

The HDI tower is the same coherence (C-33 ⇔ C-32)

Candidate fixes

Cheap diagnostic / possible interim guard (independent of the full fix)

How "better" gets defined (so autoresearch is safe to point at this)

Constraints / decision (needs-decision)

Acceptance (when scheduled)

References

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Uh oh!

Estimator coherence: robust MAP for zero-inflated / right-skewed / low-sample posteriors (C-32; C-33 resolved by ADR-019) #89

Description

Symptom

Why this is hard (not just a tie-break)

Production data-flow — the operating context (the FAO / rusty_bucket path)

Target / acceptance property (what "correct" means here)

The HDI tower is the same coherence (C-33 ⇔ C-32)

Candidate fixes

Cheap diagnostic / possible interim guard (independent of the full fix)

How "better" gets defined (so autoresearch is safe to point at this)

Constraints / decision (needs-decision)

Acceptance (when scheduled)

References

Metadata

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Issue actions

Production data-flow — the operating context (the FAO / `rusty_bucket` path)