Skip to content

fix(cli): --top N universally truncates all list-valued keys — closes #36#85

Merged
Wolfvin merged 2 commits into
mainfrom
fix/83-top-n-universal-truncation
Jun 29, 2026
Merged

fix(cli): --top N universally truncates all list-valued keys — closes #36#85
Wolfvin merged 2 commits into
mainfrom
fix/83-top-n-universal-truncation

Conversation

@Wolfvin

@Wolfvin Wolfvin commented Jun 29, 2026

Copy link
Copy Markdown
Owner

Summary

Fixes issue #36: --top N was only truncating keys listed in a hardcoded allowlist (_LIST_KEYS, 28 names). Any new command returning a list under a non-standard key (e.g., entities, widgets) would silently ignore --top N, violating the "Limit list results to top N items" contract from SKILL-QUICK.md and risking token overflow for MCP clients.

Option chosen: A (auto-discover)

Why Option A over Option B:

The issue states: "--top N harus benar-benar truncate semua list output dari command manapun (yang ada sekarang maupun yang ditambah di masa depan) — tidak boleh ada command yang diam-diam lolos dari kontrak ini." This requires a runtime guarantee, not just a CI-time check. Option B (meta-test) only catches the gap at CI time — if a contributor skips the test or the test is broken, the bug still ships to production. Option A fixes the bug at runtime: every list-valued key is automatically truncated, regardless of its name.

Option A is also simpler (KISS, no dead code): a single auto-discover pass replaces both the _LIST_KEYS loop and the _NESTED_LIST_KEYS loop. No registration needed for new commands. No _EXEMPT_KEYS meta-test infrastructure. No CONTRIBUTING.md note about registering keys.

What changed

scripts/codelens.py

_apply_top_n rewritten — replaced two hardcoded allowlist loops (_LIST_KEYS + _NESTED_LIST_KEYS) with a single auto-discover pass:

  • Iterates ALL top-level keys in the result dict
  • Truncates every list-valued key whose length exceeds top_n
  • Also scans dict-valued keys for category-keyed lists (replaces _NESTED_LIST_KEYS)
  • Skips keys in _NO_TOP_KEYS (structural/metadata keys)
  • Skips keys starting with _ (internal keys like _truncated_categories)

_NESTED_LIST_KEYS removed — dead code after the auto-discover merge. Its logic (scanning results{category:[]}, issues{type:[]}, cycles{type:[]}) is now handled by the unified dict-scan in the auto-discover loop.

_NO_TOP_KEYS added — explicit opt-out set for keys that should never be truncated:

  • available_commands — help/list-commands metadata
  • engines — analyze engine list (structural)
  • supported_languages — setup info
  • categories — category names (metadata)
  • tags — tag names (metadata)

_LIST_KEYS kept — still used by _apply_lite (different purpose: priority-ordered primary result list lookup for lite mode, not truncation). Added a comment clarifying it's no longer used by _apply_top_n.

coverage_map special case retained — it's a dict-of-dicts ({file: {fn: {...}}}), not a dict-of-lists, so the auto-discover's dict-scan doesn't catch it. Kept as an explicit special case.

tests/test_cli.py

New test class TestTopNUniversalTruncation with 8 tests:

  1. test_non_standard_key_gets_truncated — the core issue [BUG-08] --top truncation only fires on hardcoded _LIST_KEYS — contract violation for new commands #36 repro
  2. test_multiple_non_standard_keys_truncated — multiple non-standard keys truncated simultaneously
  3. test_no_top_keys_exempt_NO_TOP_KEYS keys NOT truncated
  4. test_standard_keys_still_truncated — backward compat for existing keys
  5. test_top_zero_means_no_truncation_for_non_standard_key--top 0 unlimited works for non-standard keys
  6. test_nested_dict_non_standard_key_truncated — non-standard dict-of-lists truncated
  7. test_underscore_prefixed_keys_skipped — internal _-prefixed keys not processed
  8. test_repro_from_issue — exact repro from issue [BUG-08] --top truncation only fires on hardcoded _LIST_KEYS — contract violation for new commands #36 (1000-item entities list → truncated to 5)

Files changed

File Lines Description
scripts/codelens.py +39/-28 _apply_top_n auto-discover rewrite, _NESTED_LIST_KEYS removed, _NO_TOP_KEYS added
tests/test_cli.py +105/0 8 new regression tests in TestTopNUniversalTruncation

Verification

Issue #36 repro (manual)

>>> from codelens import _apply_top_n
>>> result = {"status": "ok", "entities": [{"id": i} for i in range(1000)]}
>>> out = _apply_top_n(result, 5)
>>> len(out["entities"])
5
>>> out["entities_truncated"]
True
>>> out["entities_total"]
1000

Before fix: len(out["entities"]) == 1000 (silent --top bypass).
After fix: len(out["entities"]) == 5

pytest tests/test_cli.py -v

$ PYTHONPATH=scripts python3 -m pytest tests/test_cli.py -v
============================= test session starts ==============================
collected 26 items

tests/test_cli.py ..........................                             [100%]

============================== 26 passed in 1.38s ==============================

All 26 tests pass (18 existing + 8 new).

pytest tests/test_codelens.py -v (existing _apply_top_n tests — backward compat)

$ PYTHONPATH=scripts python3 -m pytest tests/test_codelens.py -v
============================= test session starts ==============================
collected 68 items

tests/test_codelens.py ................................................. [ 71%]
...................                                                      [100%]

============================== 68 passed in 1.00s ==============================

All 68 existing _apply_top_n / _apply_lite / _sort_items tests pass — no backward compatibility regression.

Full suite: pytest tests/ -v --ignore=tests/test_integration.py

$ PYTHONPATH=scripts python3 -m pytest tests/ --ignore=tests/test_integration.py --tb=no -q
37 failed, 809 passed, 14 skipped in 13.28s

0 new failures introduced. All 37 failures are pre-existing on main (confirmed by git stash && pytest && git stash pop → identical 37 failures). The pre-existing failures are all in test_graph_model.py, test_graph_incremental.py, test_architecture.py, test_compact_format.py, test_hybrid_type_resolver.py, and test_vuln_staleness.py — they're related to graph database tables, test pollution (issue #83), and a stale vulnerability database (486 days old), not to --top N truncation.

Reference

Found but not fixed (out of scope)

Per scope-discipline skill, documenting for BOS:

_apply_top_n only truncated keys listed in the hardcoded _LIST_KEYS
allowlist (28 names). New commands returning lists under non-standard
keys (e.g., 'entities', 'widgets') silently ignored --top N, violating
the 'Limit list results to top N items' contract from SKILL-QUICK.md
and risking token overflow for MCP clients.

Fix: Option A (auto-discover) — replace the allowlist iteration with a
runtime scan of ALL top-level keys. Every list-valued key is truncated,
except those in _NO_TOP_KEYS (structural/metadata keys like
'available_commands'). Also scans dict-valued keys for category-keyed
lists, replacing the separate _NESTED_LIST_KEYS loop.

This makes --top N a universal contract: any command, present or future,
that returns a list under any key name is automatically truncated. No
registration needed.

Changes:
- _apply_top_n: replaced _LIST_KEYS + _NESTED_LIST_KEYS loops with
  single auto-discover pass over all result.keys()
- Removed _NESTED_LIST_KEYS (dead code — merged into auto-discover)
- Added _NO_TOP_KEYS exempt set for structural/metadata keys
- Kept _LIST_KEYS for _apply_lite (different purpose: priority-ordered
  primary result list lookup, not truncation)
- coverage_map special case retained (dict-of-dicts, not dict-of-lists)

Tests: 8 new tests in TestTopNUniversalTruncation covering:
- Non-standard key truncation (the issue #36 repro)
- Multiple non-standard keys truncated simultaneously
- _NO_TOP_KEYS exempt keys NOT truncated
- Standard keys still truncated (backward compat)
- --top 0 unlimited for non-standard keys
- Nested dict non-standard key truncation
- Underscore-prefixed keys skipped
- Exact repro from issue #36

Verified: 94/94 pass in test_cli.py + test_codelens.py.
Full suite: 0 new failures (37 pre-existing failures unchanged —
all graph/architecture/pollution issues unrelated to --top).

Closes #36.
@chatgpt-codex-connector

Copy link
Copy Markdown

You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard.

@Wolfvin Wolfvin merged commit 2b58aaa into main Jun 29, 2026
0 of 6 checks passed
@Wolfvin Wolfvin deleted the fix/83-top-n-universal-truncation branch June 29, 2026 17:09
@sonarqubecloud

Copy link
Copy Markdown

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant