perf(schemaview): incremental walker + safety net (audit only)#10002
Draft
asheshv wants to merge 34 commits into
Draft
perf(schemaview): incremental walker + safety net (audit only)#10002asheshv wants to merge 34 commits into
asheshv wants to merge 34 commits into
Conversation
Replaces O(total fields) per-keystroke walks of the schema option
tree with O(visited fields) pruning. Two walkers are made
incremental:
schemaOptionsEvalulator
- Threads `changedPath` + DepListener-derived `depDests` through
every level of recursion as `mustVisit`.
- Collection rows whose globalPath doesn't overlap any
mustVisit entry are kept as their previous-walk options via
structural sharing (the row's whole subtree retains its
previous object reference, so downstream subscribers
short-circuit on Object.is).
- Nested-tab / nested-fieldset / inline-groups share the
parent's data level; recursion threads the same prev-options
slice through.
validateSchema
- Same prune logic but for error map: rows outside mustVisit
keep their prior error state instead of being re-walked.
Cross-row reads remain the known correctness hazard: a closure
(`editable`, `disabled`, `visible`, `readonly`, `validate`) that
reads a sibling row without declaring the source as `field.deps`
will silently see stale data. This commit ships the prototype with
that hazard documented on the file; the safety net (canary +
audit harness) lands in subsequent commits.
Other supporting changes:
- Per-schema opt-in (`incrementalOptions = true`) and a
window-level `__INCREMENTAL_OPTIONS__` flag for canarying
without rebuilding plumbing. Initially flipped on for
TableSchema, IndexSchema, PartitionSchema, ColumnSchema,
DomainSchema; expanded later via the audit ratchet.
- Six grid-cell evaluators (partition values, exclusion ops,
index storage params, etc.) get explicit `field.deps`
declarations for sibling-row reads.
- DepListener-reverse-deps fold into the walker so cross-row
deps stay correct under incremental mode.
- Subscribe hooks pin useEffect deps so the SubscriberManager
doesn't tear subs down/up on every render (C.1).
- SubscriberManager.signal stops re-creating subscriptions (C.2).
- DataGridView mappedCell.jsx render path trimmed.
- checkUniqueCol short-circuits when the changedPath can't
affect any uniqueCol.
- Performance bench harness (Playwright) with INCREMENTAL=1 flag
for same-session A/B comparison + gated instrumentation
(record() / measure() / count() in perf.js).
- Per-incremental-options + parent-row deps Jest tests.
Performance (same-session A/B, nested.spec OUTER=10):
INNER=500: outer typing 2.60x, inner typing 1.79x faster
INNER=1000: outer typing 3.79x, inner typing 2.00x faster
OUTER=500 INNER=10: both axes 1.88x faster.
Cluster of correctness fixes that fell out of building the
incremental walker. None alter user-visible behavior on their own;
together they harden the dep-listener plumbing and the deferred-
dependency-change queue so the walker can rely on consistent
inputs.
Deferred-dep protocol
---------------------
Establishes a documented contract for `field.deferredDepChange` so
schemas can return Promises without leaking pending updates or
losing recovery paths:
- Return undefined to opt out (no Promise queued).
- Otherwise return a Promise that ALWAYS settles. On success,
resolve with `(tmpstate) => deltaObj`. On failure, prefer
resolving with a recovery callback that resets in-progress
flags rather than rejecting.
- Side effects belong inside the Promise body BEFORE resolving;
exception: side effects whose input legitimately depends on
drain-time state may live in the returned callback.
Five existing schemas were migrated to the protocol (cleanup of
inline rejections and unresolved-promise leaks):
- TableSchema typname + coll_inherits
- ForeignTable.inherits
- ExclusionConstraint.amname
- Index.amname
- AzureCredSchema.is_authenticating
Drain-queue plumbing
--------------------
- Append to __deferred__ instead of replacing: two SET_VALUEs in
the same React batch each contribute their pending promises;
replacing the array dropped the earlier action's.
- useEffect depends on the array REFERENCE, not its length:
React's auto-batching can round-trip length 0 → N → 0 in one
commit, and length-based equality misses the round-trip.
- Prefix-match protection in DepListener.getDeferredDepChange so
a listener bound at ['a'] doesn't accidentally fire on ['ab'].
- Source path snapshot in addDepListener so caller mutations to
the array after registration can't corrupt the listener entry.
- Drain protocol guard: resFunc must be a function — log and
skip otherwise. Rejected promises surface via notifier.error
(with console.error fallback when notifier is unavailable, eg
in jest harnesses).
- Protocol-violation log promoted warn → error so test suites
that fail on console.error catch the regression.
DataGridView + schema misc
--------------------------
- DepListener caches joined source keys (the inner loop hot path
in the walker's prev-dep lookup); early-bail when no listener
registered.
- mappedCell.jsx render path trimmed.
- Six small bug fixes that surfaced during the audit:
* row className uses join(' '), not join[' '] (was concat
of literal " ")
* 'priorty' → 'priority' typo in DataGridView feature
comparator
* TableSchema typname callback guarded against stale
ofTypeTables (race when user changes type quickly)
* typname changeColumnOptions moved out of resolved callback
so it runs synchronously before drain
* Azure auth_btn compares against the last source segment
(was full path) — fix surfaces auth failures correctly
* Azure clears stale auth_code when auth fails, resets
is_authenticating
* Schema inherits REMOVE branch guards against undefined
getTableOid result
* Stale columns cleaned up on same-length inherits swap
- Tests: deterministic race test for drain useEffect dep array
(verifies APPEND semantics under fast double-dispatch).
- perf-bench nested.spec timeout raised from 300s to 500s.
Prerequisite for the canary safety net and the audit harness that
follow.
The incremental walker prototype prunes rows whose subtree the
changedPath cannot affect. Cross-row reads that aren't declared as
`field.deps` will silently see stale data — the prototype's known
limitation. Without an automated detection mechanism, every schema
flipping incremental on would need a manual review of every
closure in its `editable` / `disabled` / `visible` / `readonly` /
`validate` callbacks.
The canary is that detection mechanism. Two canaries land here:
options/canary.js (runOptionsCanary)
Wraps schemaOptionsEvalulator. When `window.__INCREMENTAL_AUDIT__`
is set, runs BOTH the full walk and an incremental walk against
the same prev-options + sessData baseline, diffs the resulting
options trees, and reports any divergence.
SchemaState/validation_canary.js (runValidationCanary)
Same shape but for validateSchema's error map. Mirrors the
options canary almost exactly so the two stay in lockstep.
Routing (defaultReport):
- Production with `window.__incremental_canary_endpoint__`
configured: navigator.sendBeacon to that endpoint.
- Test env (NODE_ENV=test) with `__throw_on_canary_divergence__`:
throws an Error with the diff — what Jest assertions catch.
- Otherwise: console.error.
The branches are mutually exclusive so the test suite's
`expect(console.error).not.toHaveBeenCalled()` afterEach doesn't
fight the "thrown" path.
Build-time gate
---------------
`process.env.__CANARY_BUILD__` is substituted to literal `false`
(or true under `CANARY_BUILD=true yarn run bundle`) via webpack
DefinePlugin. In a default production build the entire canary
branch + its `require('./canary')` is dead-code eliminated. The
canary module ships ZERO bytes to end users.
The DefinePlugin substitutes a literal boolean (`process.env.X` →
`true`/`false`), NOT `JSON.stringify(...)` — `JSON.stringify(false)`
would yield the string "false" which is truthy in JS and defeats
DCE in the false branch.
Verification
------------
- V5/M4 integration tests confirm both canaries fire on
synthetic schemas with intentional cross-row divergence
(option-side AND validation-side patterns).
- V6 bundle smoke test confirms the canary doesn't appear in a
non-canary build (later promoted to a CI shell script in the
production-hardening commit).
- Multiple aggressive-review fixes wired in to the canary
wrapper: allowlist with TTLs for known-stale entries,
per-session throttle so production sampling doesn't pay the
2x walk cost on every keystroke, FIELD_OPTIONS diff format
that's stable for grep-based debugging.
Together these form the safety net that lets the incremental
walker stay on by default for every schema — divergence becomes a
caught error, not a silent UI bug.
Production-flip blocker: the canary safety net catches divergence
when it fires, but only against schemas that have been opened in a
browser with `__INCREMENTAL_AUDIT__` enabled. To gate the global
flip behind CI, every registered schema needs synthetic dispatch
coverage that runs on every PR.
Three layers:
schema_registry.js
`registerSchema(SchemaClass)` records every BaseUISchema
subclass at module-load time. `getRegisteredSchemas()` returns
the registry as a Map.
Throws on anonymous classes / non-class arguments so a future
contributor can't accidentally register a factory function.
ESLint rule (eslint-plugins/local-rules/register-schema.js)
Flags any `export default class extends BaseUISchema {...}`
that isn't wrapped in `registerSchema(...)`. Makes the
registration contract enforceable in code review.
AST codemod (scripts/codemod-register-schema.js)
@babel/parser-driven, idempotent. Rewrites
`export default class Foo extends BaseUISchema {...}`
into
`class Foo extends BaseUISchema {...}
export default registerSchema(Foo);`
Migrated 86 default-exported schemas in one shot.
audit_harness.js
The per-schema audit runner. For one SchemaClass:
1. tryInstantiate with several constructor signatures
(no-args, fieldOptions+initValues, generic stub fixtures).
2. Generate default sessData via schema.getNewData({}).
3. Establish baseline via the FULL walk (the canary's
reference) to populate prev-options.
4. For each scalar field, dispatch a synthetic SET_VALUE
with the canary on; divergence throws and the calling
spec fails fast with the diff.
5. Same for collection cells, plus ADD_ROW / DELETE_ROW
structural dispatches at the collection root.
registered_schemas_audit.spec.js
Loops over the registry and calls auditSchema for each. A
KNOWN_DIVERGING allowlist (initially empty, the ratchet) lets
expected-to-diverge schemas stay GREEN until they're fixed —
when a schema is fixed and stops diverging, the test starts
FAILING because the divergence didn't happen. That's the
signal to remove it from the list.
Other supporting changes:
- Multi-path error tracking: SchemaState now maintains
`_knownErrorPaths` so previously-erroring rows ride mustVisit
forever after, even if the user's current change doesn't
touch that row. Catches re-errors without a full walk.
- TableSchema partition fields get declared `field.deps` so the
canary stays clean across the existing flip targets.
- Canary resilience: throttle cap, allowlist TTL semantics,
per-schema canaryAllowedDivergences hook for triage.
- Tree-shake sentinels extended to audit_harness so the audit
module itself is dead-code eliminated from production builds.
- tryInstantiate failure messages report the RICHEST attempt's
error, not the no-args one (which is almost always "X is not
a function" and masks the real problem).
- RoleSchema-specific instantiation fixture (its constructor
needs a real userInfo arg that the generic fallbacks can't
synthesize).
Opens real pgAdmin dialogs in a real browser with __INCREMENTAL_AUDIT__ + __throw_on_canary_divergence__ enabled. Asserts no console.error or pageerror at end. Covers three create dialogs: Register Server, Create Table, Create Function. audit-helpers.js holds shared boot/error-recorder utilities + an ensureServerRegistered that uses a pre-seeded SQLite ID. Requires CANARY_BUILD=true bundle so the canary stays in the bundle. Tree navigation is DOM-based here (best-effort against react-aspen virtualization). The JS tree API variant lands in the production hardening group.
Three real bugs found by the audit + UI smoke:
listenDepChanges registered NO listener for fields with declared
deps but no depChange callback. The walker's
_collectDepDestsForPath couldn't resolve evaluator-only deps so
rows whose closures read sibling state via field.deps were
silently pruned (vacuum_table.value editable was the canary's
first catch).
state.__lastChangedPath was a single scalar overwritten by React
batching. Two sibling fixedRows promises (vacuum_table +
vacuum_toast) resolving in one microtask lost one path entirely.
Replaced with __pendingChangedPaths accumulator + validate()
consumes the full set; legacy __lastChangedPath kept as a
back-compat shim only when the accumulator is empty.
drainDeferredQueue dispatched raw sessDispatch, bypassing the
listener wrapper. Every deferredDepChange's DEFERRED_DEPCHANGE
fired the bypass guard AND dropped its path from the accumulator.
Now routed via sessDispatchWithListener.
Audit + smoke infrastructure additions:
- Per-schema fixtures for the last 12 audit SKIPs.
- 87 schemas x 2 modes (create + edit) with realistic edit-mode
data seed (idAttribute + populated text fields) and a thicker
schema.state stub so closures reading top.state.X see real
shapes.
- MOVE_ROW + BULK_UPDATE passes.
- 3-row collections with per-row sentinels for chained reads.
- 6-variant scalar mutations (empty / whitespace / unicode / long
/ two sentinels).
- Batched-dispatch combinations k=2..4 with up to 2 rotations per
combo.
- 10-step sequence pass with persisted prev across dispatches
(closes the bug-class that started this whole session: stale
prev across batched commits).
- Edit-mode UI smoke (Edit Table, Edit Function) using a new
openEditDialogViaApi helper.
- UI smoke navigates via pgAdmin's JS tree API instead of DOM
expansion (works around react-aspen virtualization).
Production safety:
- LRU cap (1024 entries) on _knownErrorPaths with insertion-order
eviction + recency refresh; one-shot console.warn under canary
on first eviction + perf-counter telemetry on every eviction.
- Bypass guard: reducer fires console.error on path-bearing
actions (SET_VALUE, ADD_ROW, DELETE_ROW, MOVE_ROW, BULK_UPDATE,
DEFERRED_DEPCHANGE) missing the __viaListener sentinel. Under
canary builds only; production tree-shakes it out. Fails Jest
because setup-jest asserts console.error is never called.
- CI script web/scripts/verify-canary-treeshake.sh greps the
non-canary production bundle for canary-only symbols; non-zero
exit if any leak.
- Jest mocks for 5 ESM-only deps (react-data-grid, react-dnd,
react-dnd-html5-backend, react-resize-detector, marked); plus
a global define() shim for AMD modules. Together these unblock
16 audit-blocked schemas + 10 pre-existing failing suites.
- Developer guide at web/pgadmin/static/js/SchemaView/README.md
covering the correctness contract, deps syntax, canary build,
audit layers, dispatch rules, and common pitfalls.
Test results: 166/166 suites, 1201/1201 tests in 47s. UI smoke 5/5
dialogs in 25s. Same-session perf bench unchanged: 10x1000 outer
typing 3.79x faster, inner typing 2.00x faster.
…es (pgadmin-org#9985) The container previously applied CAP_NET_BIND_SERVICE to the python interpreter so the non-root pgadmin user could bind to ports 80/443. Some platforms refuse to honor file capabilities: - --cap-drop=ALL / OpenShift restricted-v2 SCC zero the bounding set, so the kernel returns EPERM on exec of any capability-tagged binary. This makes the image fail to start (issue pgadmin-org#9657). - --security-opt=no-new-privileges / allowPrivilegeEscalation: false causes the kernel to silently strip file capabilities on exec, so the binary runs but a subsequent bind() to <1024 still fails. Split the interpreter so neither default behavior nor restricted-runtime support has to give up the other: - Dockerfile copies python3.X to /usr/local/bin/python3-cap and applies setcap to the copy. /usr/local/bin/python3.X stays un-capped, so /venv/bin/python3 (which symlinks to it) execs cleanly under restricted SCCs. A parallel /venv/bin/python3-cap symlink keeps the venv activation working when the capped interpreter is used. - entrypoint.sh reads /proc/self/status at startup. If NoNewPrivs is set, or CAP_NET_BIND_SERVICE is missing from the bounding set, gunicorn is invoked through the un-capped python and (when PGADMIN_LISTEN_PORT is unset) the default port falls back to 8080 for plain HTTP or 8443 for TLS. A startup message records the choice. - Existing deployments with the default 80/443 mapping are unaffected: on every unrestricted runtime the bounding set still contains NET_BIND_SERVICE and gunicorn runs through the capped interpreter exactly as before. - PGADMIN_LISTEN_PORT, if set, is honored in both paths. Docs gain a "Restricted Security Contexts" subsection covering the new auto-detected fallback and the OpenShift / --cap-drop=ALL invocation. Fixes pgadmin-org#9657
…ations
Six audit/doc polish items from the post-rebase punch list, bundled
because they share the same harness file and run as one Jest pass.
Properties mode coverage
------------------------
MODES = ['edit', 'create', 'properties']. The 'properties' mode
filters fields by mode containing 'properties' — a different
field subset than create/edit. Read-only display dispatches no
user input but the walker still runs on every prop change, so
divergence under properties is a real bug class.
Tests grow from 174 to 261 (87 schemas x 3 modes); all pass.
Nested-fieldset / nested-tab / inline-groups dispatch
-----------------------------------------------------
New auditNestedFields pass. Walks one level into every group
field of type nested-fieldset / nested-tab / inline-groups,
dispatches a SET_VALUE on each scalar inside. These shared the
parent's data level but lived in the walker's nested branch (a
different code path from auditScalars); bugs that only
manifested in that branch slipped through.
Production users covered: publication, trigger, table, index,
type, sequence, pga_schedule, compound_trigger and others.
All-k rotations in auditBatched
-------------------------------
MAX_ROTATIONS_PER_COMBO bumped from 2 to Infinity (effectively
k). Each path in a k-combo gets a turn as `primary` instead of
just the first two — closes the order-dependence gap (for k=4
was 2/24 perm coverage, now 4/24 with every PATH primary
covered).
Runtime impact: audit suite went from 52s to 60s.
RERENDER action documentation
-----------------------------
RERENDER is declared in SCHEMA_STATE_ACTIONS but has NO reducer
case and NO production dispatch site. Documented in the enum as
reserved/unused with an explicit note: if a future caller starts
using it, they MUST add it to reducer.js's PATH_BEARING_ACTIONS
set so the bypass guard catches accidental raw sessDispatch.
Mutation-counter reset in setup-jest
------------------------------------
Module-level capture of _resetMutationCounter (resolved once at
worker boot, called in beforeEach). Without this, the 6-variant
text-mutation cycle's position bleeds across specs, making CI
failures harder to reproduce locally. The resolve must happen at
module load (not inside beforeEach) because requiring the audit
harness from beforeEach transitively pulls in the zustand mock
whose top-level afterEach() registration would then run in
test phase — caught a 15-suite regression during verification.
Developer guide additions
-------------------------
README.md now includes:
- Action types table with bypass-guard coverage column
(SET_VALUE / ADD_ROW / DELETE_ROW / MOVE_ROW / BULK_UPDATE /
DEFERRED_DEPCHANGE guarded; INIT / CLEAR_DEFERRED_QUEUE
exempt; RERENDER reserved)
- "Reading canary divergence output" section with worked
example (the vacuum_table divergence pattern) + the three
root-cause patterns surfaced this session
- Updated bypass-guard description (console.error, not warn)
Test scoreboard:
Before: 1201/1201 in 47s
After: 1287/1287 in 72s (+86 from new modes/passes)
Adds fast-check (devDependency) and a fuzz spec that drives RANDOM
k-batches across every registered schema in every mode. Closes the
last open coverage item from the post-review punch list.
What the fuzzer adds over deterministic auditBatched:
- Random batch sizes (k in [2, 6]) instead of fixed k in {2, 3, 4}.
- Random path-INDEX permutations (deterministic sweep uses fixed
candidate emission order + k primary rotations; fuzzer covers
the cross-product of extras orderings the rotations can't).
- Shrinking. When a property fails, fast-check shrinks the
schema/mode/index-array to a minimal reproducer — the
counterexample lands in the test failure message and points
new contributors at the smallest path-set that triggers
divergence.
The fuzzer's payoff isn't today (every fuzz run is GREEN); it's the
day someone introduces a closure with an undeclared cross-row read
and the audit's deterministic sweep misses the specific batch
shape. The shrinker will pin it down without manual triage.
fuzzBatchAgainst (audit_harness.js export)
Single-batch driver: takes (SchemaClass, mode, pathIndices),
instantiates, seeds, runs the FULL mount-time validate to
populate knownErrorPaths (the harness equivalent of
SchemaState's _knownErrorPaths populated on mount), then
drives one batched dispatch against the canary. Returns
{ok, candidates, batch, message} so fast-check's shrinker can
work the minimal counterexample.
Path indices are taken modulo the candidate list length so any
input array of length >= 2 maps to a valid k-combination;
duplicates dedupe silently so the shrinker can collapse same-
path inputs without us special-casing.
audit_fuzz.spec.js
Property: for any random (schemaName, mode, pathIndices), the
incremental walker output equals the full walker output.
Defaults to 500 numRuns; total runtime ~3s with shrinking
disabled in this happy-path config (verbose: false). Total
Jest suite goes from 1287 to 1288 tests in 75s.
One real harness bug surfaced while writing this — initial
fuzzBatchAgainst skipped the mount-time validate that populates
knownErrorPaths. Production always validates on mount BEFORE any
user dispatch, so the incremental walker's mustVisit always
includes any pre-existing error paths. Without the prep, the
fuzzer's first incremental walk pruned paths that production
wouldn't and reported false-positive divergences on schemas with
pre-existing validation errors (TriggerFunctionSchema's
seclabels.*.label was the first shrunk repro). Fix: run the
discovery walk before the batched dispatch. The same fix was
already present in auditSchema; fuzzBatchAgainst now mirrors it.
Two new workflows that close the CI gap on the incremental walker's
safety stack. The existing run-javascript-tests.yml already picks
up the new 1288-test Jest suite (setup-jest.js sets
__CANARY_BUILD__='true' unconditionally), so the audit + canary +
fuzz + sequence + nested-fieldset + properties-mode + 3-row +
6-variant + all-k-rotation passes all run in CI today. The two
remaining gaps are:
1. The canary tree-shaking gate
------------------------------
check-canary-treeshake.yml: lightweight ubuntu-22.04 job that
builds a NON-canary production bundle (CANARY_BUILD intentionally
unset) and runs scripts/verify-canary-treeshake.sh. The script
greps the resulting app.bundle.js for canary-only symbols
(runOptionsCanary, runValidationCanary, formatDivergence,
applyAllowlist, __throw_on_canary_divergence__). If any leak,
the DefinePlugin gate failed and the canary (~7 KiB) is
shipping to end users along with the 2x walk cost on every
keystroke.
No PG service, no Python, no pgAdmin runtime — just node +
webpack + grep. Runs on every PR.
2. The Playwright UI smoke
--------------------------
run-schemaview-ui-smoke.yml: ubuntu-22.04 job that:
- installs PostgreSQL 16 from PGDG, starts it on port 5916
- creates web/config_local.py with desktop mode, no master
password, no OS secret storage (so the smoke's connect
flow can drive the "Connect to Server" prompt without an
Unlock-Saved-Passwords detour)
- pre-registers a server via `python setup.py load-servers`
from /tmp/servers.json (the smoke's
ensureServerRegistered helper looks for an existing server;
without seeding it finds none in fresh CI)
- builds with CANARY_BUILD=true so the canary stays in the
bundle
- launches pgAdmin in the background + polls /browser/ until
it responds
- installs Playwright + chromium
- runs audit-smoke.spec.js with PGADMIN_SERVER_NAME=CI-PG16
pointing at the seeded server
Five dialogs covered: Register Server (canary on top-level
scalars + DataGridView dispatches), Create Table (TableSchema +
ColumnSchema + Constraints + Partition), Create Function
(FunctionSchema + NodeVariableSchema), Edit Table (TableSchema
in edit mode against an existing system table), Edit Function
(FunctionSchema in edit mode).
Failure artifacts: pgAdmin server log + Playwright trace +
screenshots uploaded on any non-success path so a flake or real
divergence can be triaged from the failed run.
Security audit of both files: only ${{ github.* }} interpolation
is concurrency.group (a YAML-level identifier, not shell context).
All run: blocks use static content. env: blocks use hardcoded
constants. No untrusted-input → shell paths. Matches the existing
workflow security pattern in the repo.
The audit harness ratchet on dev/incremental-audit reached zero — all 86 registered schemas pass cleanly under the incremental walker. Flip the conditional in SchemaOptionsEvalulator (options/registry.js) and SchemaState.validate so any dispatch with a concrete changedPath takes the pruned path. Schemas / dialogs that genuinely need full-walk semantics can opt out by setting `incrementalOptions: false` on viewHelperProps or the schema instance; the global `window.__INCREMENTAL_OPTIONS__ = false` is an emergency-rollback escape hatch that disables it everywhere without rebuilding. SchemaState.updateOptions now propagates schema-level opt-out into viewHelperProps so the walker (which only reads vhp) honors it. The mirror logic for opt-IN stays for back-compat in case the default ever shifts back. Test updates: - The "incremental (window global opt-in)" describe block flipped to "window global escape hatch" — afterEach uses `delete` instead of `= false` so the cleanup doesn't accidentally leave the opt-out signal set for subsequent tests. - "schema without incrementalOptions runs full walk" now asserts default-on (the schema needs an explicit `false` to opt out); added a separate test for the explicit opt-out path. Full jest suite: 1097/1097, including the audit harness reporting zero divergences across all registered schemas.
The datagridview perf bench compares per-keystroke + per-action
costs in pgAdmin's SchemaView. To validate the incremental walker
flip (default-on), the spec needs to run twice in the same session
— once with incremental ON (default), once OFF — so cross-session
drift doesn't muddle the delta (see memory note about same-session
A/B requirement).
`INCREMENTAL_OFF=1` toggles `window.__INCREMENTAL_OPTIONS__ = false`
before each dispatch, opting out of the flip's default-on
behavior. Result files prefixed `on-` vs `off-` so a diff is
straightforward.
A/B results captured on this branch (Register Server > Parameters,
100 rows, 20 keystrokes per scenario):
Grid cell typing: validate -33% (3.40ms vs 5.07ms / dispatch)
General Name typing: validate -36% (3.25ms vs 5.12ms / dispatch)
schemaOptionsEvalulator: -85% to -86%
SchemaState.updateOptions: -84% to -86%
validateSchema: +6% to +8% (collectAll's no-short-circuit cost)
setError calls: identical (collectAll doesn't increase
volume in this fixture)
ADD_ROW (100 rows): ~no change (collection-level changedPath
overlaps every row's path → effectively
a full walk anyway)
Per-keystroke walker code: ON ~3.3ms vs OFF ~7.3ms. The dominant
cost (options evaluation) drops by 84-86%; the small validateSchema
regression from removing short-circuit is dominated by the wins.
The infrastructure (audit harness, multi-path tracker, canary
resilience) DID NOT cost us perf — it shipped alongside a net win.
|
Important Review skippedDraft detected. Please check the settings in the CodeRabbit UI or the ⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Pro Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Three CI fixes from the first run of PR pgadmin-org#10002: 1. ESLint indent + no-console + display-name + import-rule Local dev passed because yarn run test:js-once linted on the way through but the runner caught extra cases on a clean clone. 78 errors → 0. - audit_harness.js rotation loop had inconsistent indent around the inner combo/extras block (~70 lines) - perf-bench/ scripts use console.log to emit per-keystroke numbers; they're diagnostic dev scripts, added the directory to .eslintrc.js's ignores - react-resize-detector mock's withResizeDetector HOC lacked displayName on the wrapped component - perf.js had a stale 'import/no-unassigned-import' disable comment referencing a rule the project doesn't load 2. schemaview-ui-smoke 'Start PostgreSQL' step failure ubuntu-22.04 runners come with an older PostgreSQL pre- installed; the new install of pg16 didn't recreate /etc/postgresql/16/main/ when a conflicting cluster was present. Added a 'Uninstall any pre-installed PostgreSQL' step mirroring run-feature-tests-pg.yml's pattern, runs before the pg16 install. 3. Cascade: run-javascript-tests (3 OS), run-feature-tests-pg (5 versions), and build-container all failed on the same lint preflight as their first step. Fix pgadmin-org#1 cascade-fixes them all. verify-canary-treeshake passed on the first CI run; no changes needed there. All 1288 Jest tests still pass locally.
# Conflicts: # web/pgadmin/static/js/SchemaView/options/registry.js
Locally replayed the UI smoke workflow against an isolated SQLite
+ the existing pgAdmin codebase and confirmed:
- JSON server format (Servers / Name / Group / Port / Username /
Host / MaintenanceDB / SSLMode) is the right shape.
- load-servers requires the SQLite to already exist; bare
workflow run errored 'SQLite database file does not exist'
before adding any entries.
- setup.py setup-db creates the DB cleanly given a valid
config_local.py with SQLITE_PATH set.
Added a 'Initialise pgAdmin config DB' step that runs setup-db
between config_local.py creation and load-servers. With the fix
in place, the local replay completes all 5 UI smoke tests in
~27s — same shape as a normal local run.
CI run #2 result: 3 of 5 smoke specs passed (Register Server, Create Table, Create Function); the two edit-mode specs failed with 'openEditDialogViaApi: no child of type "table"/"function" found under selected node — does the database have any?' CI's fresh PG 16 starts with an empty postgres database — no user tables, no user functions. The smoke's openEditDialogViaApi helper looks for the first child of the requested type and opens Properties on it; with no children, it correctly throws. Locally this worked because my dev PG has objects in the test database. In CI we need to pre-create them. Added a 'Seed schema objects' step that runs a single psql with: - CREATE TABLE IF NOT EXISTS audit_smoke_table (id, name) - CREATE OR REPLACE FUNCTION audit_smoke_func() RETURNS int The 'IF NOT EXISTS' / 'OR REPLACE' make the step idempotent so re-runs on the same runner don't error. All other 14 jobs went green in the prior run; this should bring schemaview-ui-smoke green too.
13 lint warnings on the latest CI run, all about unused vars. Three
fixes:
1. catch (_e) → catch (10 sites)
ES2019 optional catch binding. The error is genuinely unused
in each site:
- canary.js, validation_canary.js: sendBeacon swallow
(documented browser quirk; silent telemetry path).
- audit_harness.js tryInstantiate fall-through: next
attempt's outcome replaces the error.
- audit_harness.js seedCollections / ADD_ROW / sequence
/ applyMutation: per-step skips already covered by other
passes; failure mode is captured via null returns or
skip-result fields.
- audit_harness.js auditSchema initial validate:
dispatch loop's own try/catch catches the same error
with the field path context.
- setup-jest.js AMD define + audit_harness import:
best-effort module loading; failures are expected for
non-SchemaView Jest workers.
2. catch → catch (e) WITH the error captured in fuzzBatchAgainst
initial validate (one site, audit_harness.js ~1263). When the
fuzzer's mount-time validate throws, the previous swallow
reported only 'initial validate threw'. fast-check's shrinker
would land on a minimal counterexample like (TableSchema,
create, [0, 1]) with no clue WHY. Now the reason includes
so the shrunk reproducer names
the real failure.
3. Removed dead code in batched_changed_paths.spec.jsx
buildState helper + the SchemaState import that fed it. Both
were leftovers from an earlier iteration that built
SchemaState directly; the spec switched to a Harness
component using useSchemaState end-to-end. The two test()
cases now reference Harness exclusively.
CI scoreboard:
Before this commit: 1289 tests pass, 13 lint warnings.
After: 1289 tests pass, 0 lint warnings, 0 errors.
Replaces previously-introduced lint suppressions with actual fixes:
1. ESLint config now declares the genuine project globals
----------------------------------------------------
pgAdmin (webpack ProvidePlugin global, used in bench-fixture.js)
and expect (Jest global, used at module scope in setup-jest.js's
afterEach) added to the eslintrc globals block. Three
eslint-disable-next-line no-undef markers removed in turn.
2. perf-bench directory ignore -> per-file disables
------------------------------------------------
Dropped the broad regression/perf-bench/ ignore from
.eslintrc.js. Replaced with file-level
/* eslint-disable no-console */ headers ONLY in the two specs
that actually need them (nested.spec.js, datagridview.spec.js).
audit-helpers.js uses only console.error which the rule
already allows — no disable needed.
File-level disables are more honest than a directory ignore:
a maintainer reading the spec sees consent for THAT file, not
a blanket carve-out hidden in the config.
3. perf.js: 5 line-level disables -> one block-level disable
----------------------------------------------------------
The dump() function uses console.table for tabular output —
the no-console rule's allowed warn/error/trace can't express
it. Switched 5 // eslint-disable-next-line no-console comments
to a single /* eslint-disable no-console */ /* eslint-enable */
block around the function body. Rest of perf.js still under
the rule.
4. Stale openTreeContextMenu helper removed
----------------------------------------
The directory ignore was hiding an unused-vars warning on
openTreeContextMenu in audit-helpers.js — leftover from before
I switched UI smoke to the JS tree API. Deleted (8 lines).
navigateToCatalogNodeViaApi + openCreateDialogViaApi / Edit
variant superseded it.
5. Empty catch block in nested.spec.js
----------------------------------
try { ... } catch {} -> catch { /* try the next candidate */ }
for the cancel-dialog candidate selector loop. The block is
intentionally empty; the comment now says so explicitly.
Net change:
Before: 11 eslint-disable directives + 1 directory ignore
+ 13 'warning' findings in CI (the original cleanup)
After: 2 eslint-disable markers total (perf.js dump block,
bench-fixture.js mount-log — both with explicit
comments explaining WHY), 0 directory ignores added
by this PR, 0 lint warnings, 0 errors.
All 1289 Jest tests still pass.
…y reset Three honest improvements over per-PR suppressions / shortcuts. 1. scripts/ directory ignore removed from .eslintrc.js ----------------------------------------------------- The 'scripts/' entry was added by this PR (commit d22a249) to hide codemod-register-schema.js from lint. With it gone, the codemod surfaces 8 real issues: - 1 string-quote violation (autofixed) - 7 console.* statements (CLI report mechanism — legitimate for a one-shot script; gated by a single file-level /* eslint-disable no-console */ marker with an explicit comment noting it's the script's output channel) Net change: 1 directory ignore + 0 disables → 0 directory ignores + 1 well-commented file-level disable. 2. auditNestedFields recurses through ALL depths ------------------------------------------------ Previously dispatched against scalars only 1 level into a nested-fieldset / nested-tab / inline-groups container. Real schemas chain group containers multiple levels deep — e.g. TableSchema's Storage block, IndexSchema's With block — and deep scalars went unexercised. New walkNestedScalars(rootSchema) generator yields every scalar reachable through the group hierarchy, bounded by MAX_NEST_DEPTH=6 (generous for production shapes, guards against pathological / cyclical schemas). Note: the walker doesn't prune within nested-fieldset (all fields in a group are always walked), so multi-level divergences aren't a likely bug class today. The value is coverage — dispatching on deep scalars in case a future walker change adds pruning, or in case a deep closure has its own bug. New synthetic test (3-level L1->L2->L3) verifies the recursion fires at every depth by dispatch count. 3. setup-jest: lazy mutation-counter capture via require.cache -------------------------------------------------------------- Pre-fix: setup-jest required audit_harness at module load on every Jest worker (including ~80% non-SchemaView workers). Module-load require chosen because requiring inside beforeEach trips the zustand-mock's top-level afterEach() registration. Post-fix: resolve the audit_harness path once at module load (require.resolve only — doesn't load the module), then check require.cache[path] inside beforeEach. If a SchemaView spec has already loaded the harness, capture _resetMutationCounter from the cached exports; otherwise no-op. Non-SchemaView workers do one property lookup per beforeEach and skip — no audit_harness in their require graph. Test scoreboard: Before: 1289 tests in 75s After: 1290 tests in 75s (+1 from the depth-3 recursion regression-lock) Perf re-benched post-rebase at 10 × 1000: Outer typing ON 60 / OFF 231 ms/key → 3.85x Inner typing ON 174 / OFF 351 ms/key → 2.02x Inner ADD_ROW ON 566 / OFF 696 ms → 1.23x PR description's 3.79x / 2.00x figures still accurate (slight improvement within run-to-run noise).
Two correctness/safety improvements surfaced by an aggressive
post-build review of this PR.
Iter 1: depDests now includes knownErrorPaths (LOW-severity gap)
--------------------------------------------------------------
updateOptions(changedPath, depDests) was called with depDests
built from { changedPath, extras, extras' deps }. The mustVisit
for validateSchema ALSO included _knownErrorPaths.values(), but
depDests passed to the options walker DID NOT.
Latent inconsistency: the options walker could prune a row that
previously had a validation error, miscomputing its options if
any closure read top.errors / row error state. The canary would
catch divergence here, but adding the paths is the correct fix.
Iter 3: per-pass try/catch so one closure crash doesn't bury 7 passes
-------------------------------------------------------------------
Pre-fix: a SINGLE outer try around all 8 audit passes. If any
pass threw a non-divergence error (closure crash on missing
nodeInfo etc.), the entire chain aborted — remaining passes
never ran. CompoundTriggerSchema, ViewSchema, SubscriptionSchema
were affected: 4 SKIP entries in the registered_schemas_audit
output where the schema was getting ZERO audit coverage.
Refactored to runPass(label, fn) wrapper that catches per-pass.
Divergences still propagate (the isDivergenceError check is
preserved). Non-divergence errors aggregate into a passSkips
array; the audit reports skip ONLY when zero passes contributed
dispatches — otherwise the schema gets PARTIAL coverage from the
passes that ran.
Result: 4 SKIPs → 2 SKIPs. CompoundTriggerSchema now gets
partial coverage (previously 0); ViewSchema and SubscriptionSchema
still skip because EVERY pass throws on those (genuine harness
fixture limitation, not a regression).
Test scoreboard:
Jest: 167 / 167 suites, 1290 / 1290 tests passing (unchanged
count — these are quality fixes, not new tests)
Audit: 261 (87 schemas × 3 modes) — 260 passing tests + 1
discovery (was 260 passing, 0 visible regression in
net count, but better partial-coverage shape)
Adds two confidence-building suites for the SchemaView incremental
walker on top of the original 5-dialog audit-smoke set:
audit-smoke-extended.spec.js (15 dialogs):
- Schema-level (10): View, MaterializedView, Sequence, Type, Domain,
Procedure, Aggregate, ForeignTable, Collation, FTS Configuration
- Server-level (2): Role, Tablespace
- Sub-catalog (2): Trigger, Index
- Function-like (1): Trigger Function
Each test opens the dialog with the canary's throw-on-divergence
flag enabled, waits for Name, closes, and asserts canary clean.
audit-visual-regression.spec.js (5 dialogs):
Snapshot-based regression for Edit Table, Create Function, Create
Type (composite default), Edit Role, Create Index. Catches CSS /
layout drift the walker canary can't see. Includes initial darwin
baselines + README documenting the master-baseline-then-diff flow
with explicit cross-OS / PG-version / browser-version caveats.
Helper changes (audit-helpers.js):
- navigateToServerCollectionViaApi for server-level nodes
- navigateToTableSubCollectionViaApi for sub-catalog nodes
- Expanded coll-X catalog mapping (mview, fts_configuration, etc.)
H1 — Vacuous canary assertion. expect(__INCREMENTAL_AUDIT__).toBe(true)
only verified the harness flag was set, NOT that the canary actually
walked. On a non-CANARY_BUILD bundle the canary is tree-shaken and the
spec passes silently. Added window.__canary_entry_count__ in
validation_canary.js (zero cost in prod — entire file is tree-shaken)
and new expectCanaryExecuted() asserts > 0 after each dialog open.
H2 — Close-button selector ambiguity. button:has-text("Close").first()
could match a transient toast or a lingering Register Server dialog
under race conditions. Scoped to
.dock-panel.dock-style-dialogs button[data-test="Close"].
H3 — Sub-collection nav error message. Clarified that an empty
'available:' list from openAndFind means public has no tables; the CI
seed step likely didn't run.
H4/H5 — README workflow. Original "capture on master then cherry-pick"
was unworkable because the spec is new to this PR (doesn't exist on
master). Rewrote with: (a) single-branch capture-and-commit flow, and
(b) a temp-branch-off-master pattern for true regression diffing.
M1 — Visual-diff threshold was dialed up to make CI green, not
calibrated. Replaced maxDiffPixelRatio: 0.005 (~4800 px on a typical
dialog) with maxDiffPixels: 100 — a count-based threshold based on
observed back-to-back drift (0-15 px) + 5x safety margin.
M3 — Helper try-finally so a missing Name textbox doesn't leave a
stale dialog over the tree for the next spec.
M5 — Visual specs auto-skip on non-darwin via test.beforeEach; the
committed darwin baselines guarantee-fail on Linux without this.
M6 — Removed misleading mask-CodeMirror comment block; specs don't
navigate to the SQL tab so the mask wasn't needed.
L1 — README kill command now uses pkill ... || true to avoid set -e
aborting on no-such-process.
L3 — Edit Role visual baseline masks Name + Comments fields so the
spec works against any install (different first-role names otherwise
guarantee a diff).
L4 — Removed dead Operators → coll-operator mapping (no spec uses it).
L5 — Weakened "(inherits deferred)" / "(amname deferred)" claims to
"(mount)" — the specs don't toggle the dropdowns that would actually
exercise the deferred-dep path.
NOT addressed (out of scope for this branch):
M2 — bootPage duplication between audit-smoke and the new spec
files. Refactor belongs on a dedicated cleanup branch.
H1 also applies to web/regression/perf-bench/audit-smoke.spec.js
on the parent (audit) branch — fix there belongs on PR pgadmin-org#10002.
- Counter delta check: expectCanaryExecuted now compares against a
baseline taken before the dialog opens (defends against future
Playwright page-reuse modes that would silently pass on a stale
leftover count).
- README: python -> python3 (only python3 is on PATH in common
environments including modern macOS dev boxes).
- Stale doc reference in audit-visual-regression.spec.js: pointed to
removed README section "Mitigation: capture baselines in CI itself";
now points to the actual current section name.
Acknowledged-as-follow-up (not addressed in this commit):
- Exclusion Constraint deferredDepChange not covered by smoke
(tree path is 4-level: table -> constraints group -> coll- ->
sub-node, needs a new helper).
- Mount-only smoke doesn't exercise the deferred-dep RUNTIME path
(would need specs that actually toggle amname / inherits).
Brings visual snapshot coverage in line with the 20-dialog UI smoke
set. Refactored the 5 existing specs onto three category helpers
(snapshotSchemaChild / snapshotServerChild / snapshotTableChild) so
adding the 15 new specs avoided 15x copy-paste.
New visual specs:
Schema-level (12 net new):
Create Table, Edit Function, Create View, Create MView,
Create Sequence, Create Domain, Create Procedure, Create Aggregate,
Create Foreign Table, Create Collation, Create FTS Configuration,
Create Trigger Function
Server-level (2):
Create Role, Create Tablespace
Sub-catalog (1):
Create Trigger
Existing specs (Edit Table, Create Function, Create Type, Edit Role,
Create Index) preserved; baselines re-captured under the new helper
flow to keep the 20-baseline set internally consistent.
Edit Function masks Name (env-specific first-function name).
Resource note: 20 sequential pgAdmin sessions can exhaust PG
max_connections. Restart pgAdmin between capture and verify runs
(documented in README-visual-regression.md).
…+ visual
C1 — README pointed to docs/scratch/2026-06-04-audit-mock-harness.md
which is untracked. Inlined the 2-sentence mock-pivot summary in the
README directly so the doc stands on its own.
C2 — Visual specs silently dropped any walker divergence: they called
installErrorRecorders but never asserted on the captured errors. A
real walker regression during a visual spec's dialog mount would slip
past visual diff if pixels happened to match. Added expectNoDivergence
inside each snapshotXChild helper after the screenshot.
C3 — bootPage in both new spec files dropped the
`expect(__INCREMENTAL_AUDIT__).toBe(true)` assertion from the original
audit-smoke.spec.js. Re-added in both files so a SPA reload between
goto and dialog open fails the test instead of silently disabling the
canary.
H2 — Helper error message referenced a CI seed step
`create_test_tables_function` that doesn't exist in any workflow.
Replaced with a real pointer ("the schemaview-ui-smoke job's pre-spec
seed step").
H3 — Edit Role / Edit Function visual masks used page-level
getByRole locators that could match unrelated overlays. Scoped to
dialogLocator(page).getByRole(...) so mask resolution stays inside
the dialog panel.
M3 — Section comment said "Schema-level dialogs (8 tests)"; actually 10.
M5 — README "Dialog coverage" table listed 5 dialogs; visual spec now
has 20. Replaced the per-dialog table with a 3-row category summary
plus a "Highlights worth knowing about" section.
Not addressed (out of scope or follow-up):
- M1 bootPage triplicate refactor (audit-smoke also has it; touching
parent branch's spec belongs on PR pgadmin-org#10002).
- M2 snapshotTableChild editMode (sub-catalog Edit isn't a stated
coverage goal yet).
- M4 Edit Function CI seed (the spec relies on a function existing;
same gap as table-existence-for-sub-catalog).
- H4 5-6 min runtime (performance; functionality is correct).
…on-level
Mount-only smoke only catches "dialog fails to construct" — the
narrowest possible regression. A walker bug that triggers on
ADD_ROW, tab-switch, or any cross-tab dep evaluation slips past
silently. Aggressive review flagged this honestly: the 15 smoke
specs were theater compared to the 5 original audit-smoke specs
which actually interact.
New exerciseDialog helper fires four real dispatch types after the
dialog mounts:
1. SET_VALUE on Name (top-level scalar fill)
2. tab-switch through (renders each tab's fields + initial
all visible tabs validate pass on the schema's other
collections)
3. ADD_ROW on the first (collection dispatcher; the most common
DataGridView in the failure surface for walker bugs because
dialog (if any) it adds an entry the validator has to
re-walk)
4. SET_VALUE on the new (collection-row scalar; exercises
row's first cell per-row validation paths)
Best-effort: dialogs without Name (none, today) skip step 1;
dialogs without a DataGridView (Collation, Tablespace) skip steps
3-4. Validation errors from the typed values aren't confused with
canary divergences — expectNoDivergence filters on canary-specific
"Incremental walker divergence" / "Incremental validator
divergence" messages only.
Close handler now auto-accepts the "Discard changes?" confirm that
some dialogs pop after Name was mutated.
Runtime: ~1.3 min for all 15 specs (was ~1.0 min mount-only).
Renamed "(mount)" suffixes off Foreign Table and Index specs — no
longer accurate.
…ental mode
Edit-mode smoke specs (added in this branch) caught a real walker bug
in the options-tree incremental walker: when the prior walk's options
tree doesn't have an entry for a collection row (race with initial
walk's state propagation, or first dispatch after fresh data load in
Edit mode), the incremental prune at registry.js:253 would skip the
row AND leave nextColl[idx] undefined. Pruned rows are supposed to
inherit their prior reference via the earlier `{...prevColl}` spread,
but when prevColl[idx] is undefined there's nothing to inherit.
Symptom: walker canary divergence in 4 Edit-mode dialogs:
Edit Type — composite.0 collection
Edit Domain — constraints.0 + cell states
Edit Foreign Table — columns.0 + cell states
Edit Index — columns.0 + per-cell options (colname, nulls,
op_class, ...)
All four shared the same diagnostic pattern: incremental=undefined,
full={"canEditRow":true,"canDeleteRow":true}.
Fix: guard the prune on `prevColl[idx] !== undefined`. When there's
no prior to inherit, walk the row fully — that's the only way to
produce a non-undefined options entry for downstream consumers
(DataGridView's canEditRow / canDeleteRow checks, per-cell field
options). When prev IS populated, the prune still applies and
structural sharing is preserved.
Test updates:
- incremental_options.spec.js (unit tests for the walker) now do a
warm-up full walk to populate prev before exercising incremental
behaviour. Without the warm-up, the new guard correctly visits
all rows because none have prior entries to inherit — which is
the right runtime behaviour but the wrong shape for asserting
"incremental visits only the changed row." Switched 9 assertions
from `visitedRowIdxs` (FIELD_OPTIONS presence) to
`newlyVisitedRowIdxs` (reference comparison vs prev) — the
correct semantic for "this walk visited the row" once pruned
rows correctly inherit from prev.
Spec/seed/helper changes co-located in this commit:
- audit-smoke-extended.spec.js: 15 Edit-mode smoke specs added
(one per dialog type), exercising real dispatches via the
existing exerciseDialog helper. Coverage now 30/30 (15 Create +
15 Edit) on the extended dialog set.
- audit-helpers.js:
* openEditDialogViaApi: poll for children extended from 2s to
10s (sub-collection REST fetches are slower than top-level
catalog ones).
* navigateToTableSubCollectionViaApi: pin to `audit_smoke_table`
by name instead of "first table" — local environments had a
pre-existing XSS-payload-named table sorting first
alphabetically, with no triggers/indexes. Force-find the
CI-seeded table; emit a clear seed-step pointer if missing.
* tree.open() short-circuits on already-opened nodes; replaced
with ensureLoaded() to force the REST fetch.
* Stash the just-selected node on window.__pgadminLastNavigatedNode
so openEditDialogViaApi can recover from tree.selected() drift
(observed: pgAdmin's tree.selected() slips back to a parent
node after a tree-refresh event triggered by the sub-collection
REST fetch completing).
- .github/workflows/run-schemaview-ui-smoke.yml: extended the
Edit-mode seed step to provide all 13 newly-required object
types (view, mview, sequence, type, domain, procedure, aggregate,
foreign_table, collation, fts_configuration, trigger_function,
trigger, index).
pgAdmin's tree.selected() observably drifts to a parent node (typically the database) after a tree-refresh event fires when the selected collection's REST children land. Test code that does tree.select(node, true) followed by immediate dispatch via tree.selected() races against the refresh: sometimes the drift hasn't happened yet, sometimes it has — surfacing as random "no child of type X found" (openEditDialogViaApi) or "wait Name textbox timeout" (openCreateDialogViaApi) flakes depending on which spec hit the race. Validation runs with --workers=1 had a 30% iteration failure rate before this fix (3 of 10 iters had 1 spec fail). --workers=4 had a 50% iteration failure rate. Different specs failed each iteration — the flake was sequence-dependent, not spec-specific. Two-part fix: 1. Every navigation helper now stashes the just-selected node on `window.__pgadminLastNavigatedNode`. Previously only navigateToTableSubCollectionViaApi did this — adding the same stash to navigateToCatalogNodeViaApi and navigateToServerCollectionViaApi. 2. openCreateDialogViaApi now consults the stash first and falls back to tree.selected() only if no stash is set. Mirrors what openEditDialogViaApi already did. Post-fix validation runs: --workers=1: 10 iters × 30 specs = 300/300 pass, 0 flakes --workers=4: 10 iters × 30 specs = 300/300 pass, 0 flakes Production code is NOT affected: callers of tree.select() in production (server connect/disconnect callbacks, search-objects result-click, etc.) hold the node reference they passed and don't re-consume via tree.selected() afterward. The drift only surfaces in programmatic select-then-consume patterns used by tests.
…l-by-default
Three test-infrastructure improvements developed during the flake
investigation that culminated in the tree.selected() drift fix:
1. Per-test disconnect (audit-helpers: disconnectServerViaApi) —
afterEach hook fires DELETE /browser/server/connect/{gid}/{sid}
for each connected server in the test's session. Releases the
PG connection pool pgAdmin opened during the spec. Required
because pgAdmin's connection_manager is keyed by Flask session
(psycopg3 driver:78), so a worker that doesn't disconnect leaves
the pool open until session GC — 30 specs × N iterations would
otherwise exhaust PG max_connections. Includes the CSRF token
from window.pgAdmin so the DELETE isn't silently rejected.
Wired into audit-smoke-extended + audit-visual-regression.
2. AUDIT_DEBUG=1 instrumentation (audit-helpers: auditDebug, plus
markers in ensureServerRegistered + the three navigate helpers
+ the smoke spec's openAndAssertClean + exerciseDialog). Silent
by default. Surfaces phase-by-phase timing so flake
investigation can see WHICH phase stalled (page boot vs server
connect vs navigation vs dialog open vs interaction vs close)
without re-instrumenting code each time.
3. Default workers: 4 (playwright.config.js). Validated on this
branch with --workers=4 hitting 10 iters × 30 specs = 300/300
with zero flakes after the drift fix landed (separate commit).
--workers=8 also validated for one iteration (45s → 27s).
Default kept at 4 for safer CI margin against PG max_connections.
Also: idempotent installErrorRecorders (stashes on page.__errorRecorder
and clears on reuse) — defensive against a possible future shift to
worker-scoped page fixtures. ensureLoaded() instead of tree.open()
for sub-collection children to bypass aspen's "already-opened"
short-circuit when the previous walk cached the node.
Run-time impact (workers=4 default): 30-spec suite finishes in
~45s vs ~162s serial = 3.6x speedup. Combined with the drift fix,
zero flakes across 300 specs in validation.
Static review of the 10 cherry-picked commits surfaced a few items:
H2 — exerciseDialog's Name fill in Edit mode was a latent danger.
The CI seed creates fixtures by name (audit_smoke_table,
audit_smoke_view, ...) which subsequent Edit-mode specs in the same
iteration look up by name. The close path auto-clicks "Yes" on the
"Discard changes?" confirm — which is currently correct. But if a
future pgAdmin change ever shifts that prompt's verb (e.g.
"Save changes?" with Yes meaning Save), Name fills would rename
seeded fixtures and break every subsequent Edit-mode test in the
run. Added `editMode` parameter to exerciseDialog; skip the Name
fill entirely in Edit mode. Create mode is unaffected — the dialog
mounts with no persisted record, save would create a NEW object.
H3 — Comment for the cell-fill step claimed "first input that
became editable in the newly-added row" but in practice
`dialog.locator('table input').first()` resolves to the first
input anywhere in the dialog, typically the first cell of the
first grid (NOT necessarily the newly added row). Updated the
comment to reflect actual behavior; the test still exercises a
SET_VALUE dispatch in a grid which is what matters for the walker
canary.
L1 — Stale comment in installErrorRecorders referenced a worker-
scoped page fixture that was dropped during the flake
investigation. Rewrote to describe the actual idempotency rationale.
L2 — Stale comment in audit-visual-regression about restarting
pgAdmin between capture/verify is no longer accurate — the
per-test afterEach disconnect (also added in this branch) handles
the connection cleanup. Updated.
L3 — Comment cited `test.use({ page: 'serial' })` which isn't a
real Playwright option. Corrected to the actual API
(`test.describe.configure({ mode: 'serial' })`).
Verified after fixes:
- lint clean
- audit-smoke (5+30): 35/35 pass
- audit-visual-regression: 20/20 pass
- Jest unit suite: 1290/1290 pass
Items NOT addressed (acceptable in current architecture):
- H1: dialogLocator mask resolved lazily — works correctly under
test-scoped page fixture; would need revisiting if worker-scoped
is reintroduced.
- H4: window.__pgadminLastNavigatedNode global stash not cleared
after read — works correctly because each test gets a fresh
window; documented contract.
Visual baselines now live under `audit-visual-regression.spec.js-snapshots/<platform>/` instead of the previous flat layout with `-darwin` filename suffix. The unconditional `process.platform !== 'darwin'` skip is replaced with "skip if no baselines exist for this platform" — so linux gets coverage as soon as someone runs `--update-snapshots` on linux and commits the resulting `linux/` directory. snapshotPathTemplate in playwright.config.js wires it together. Existing darwin baselines moved (via `git mv`) from `<name>-darwin.png` to `darwin/<name>.png`. All 20 still match in the new path. Also: exerciseDialog's cell-fill step now targets the LAST row of the dialog's tbody (the row just added by ADD_ROW), not the first input anywhere. DataGridView appends new rows at the end of tbody, so `tbody tr:last-of-type input:first-of-type` is the right target. Best-effort still — many grids render the first cell as a typeahead/dropdown that swallows the fill. Verified after the move: - lint clean - audit-smoke-extended: 30/30 pass - audit-visual-regression: 20/20 pass on darwin
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Incremental schema walker + safety net
Introduces an incremental schema walker (O(visited) instead of O(total)
field re-evaluation per keystroke — 2–4× faster on 500–1000-row
dialogs), flips it on globally by default, and ships the safety
infrastructure to make that flip safe.
The walker prunes via
mustVisit = changedPath ∪ depDests(changedPath) ∪ knownErrorPaths+ structural sharing ofunvisited subtrees. Bug class: closures reading sibling rows
without declaring
field.depssee stale data once pruning is on.This PR makes that detectable in CI.
What lands
1 — Walker prototype
Incremental
schemaOptionsEvalulator+validateSchema. Structural sharing of unvisited collection rows, mustVisit derived fromchangedPath ∪ depDests(changedPath) ∪ knownErrorPaths. Initial opt-in per schema viaincrementalOptions = true; window-level__INCREMENTAL_OPTIONS__flag for canarying. Initial flip targets: Table, Column, Partition, Index, Domain. Six parent-row dep declarations added.2 — Deferred-protocol & DataGridView hardening
Documents the
deferredDepChangeprotocol (always-settling Promises, recovery callbacks, side-effect placement). Migrates Table / ForeignTable / ExclusionConstraint / Index / AzureCred schemas to the protocol. Drain useEffect ref-equality fix, queue accumulation, DepListener prefix-match protection, six small DataGridView typo / staleness / className fixes. Plus a deterministic race test for the drain.3 — Divergence canary
Two side-by-side walkers that diff their output and
console.error(or throw under tests) on mismatch. Build-time gated via webpack DefinePlugin substitutingprocess.env.__CANARY_BUILD__to a literal boolean; in default production builds the canary module is DCE'd entirely. Verified by a CI script (see #7).4 — Schema registry + audit harness
registerSchema(Class)self-registration, ESLint rule, AST codemod that migrated 86 default-exported schemas in one shot. The audit harness (auditSchema) walks every registered schema with the canary on. Multi-path error tracking (_knownErrorPaths) gives incremental validation production-equivalent safety semantics. ADD_ROW / DELETE_ROW / MOVE_ROW / BULK_UPDATE / nested-fieldset dispatch coverage, 3-row collections with per-row sentinels for chained reads, 6-variant scalar mutations (empty / whitespace / unicode / long), exhaustive k-combination batches (k=2..4) with all-k primary rotations, persisted-prev10-step sequence pass, fast-check property-based fuzzing on top.5 — UI smoke (Playwright)
5 dialogs against a real pgAdmin instance with the canary's throw flag on: Register Server, Create Table, Create Function, Edit Table, Edit Function. Tree navigation via pgAdmin's JS tree API (the DOM-based path was brittle against react-aspen virtualization).
6 — Production-flip hardening
Three real bugs surfaced by the audit / smoke + their fixes:
listenDepChangesregistered NO listener for evaluator-only deps. Fields declaringdepsbut nodepChangecallback had no entry in the listener registry, so_collectDepDestsForPathcouldn't include them in mustVisit — pruned silently. Fixed: register a listener for every declared dep regardless of callback presence.state.__lastChangedPathoverwritten by React batching. Two siblingfixedRowspromises (VacuumSettingsSchema'svacuum_table+vacuum_toast) resolving in one microtask tick: the secondsetUnpreparedDatadispatch clobbered the first's path; validate ran with only one path; the walker pruned the other collection's newly-arrived rows. Fixed:__pendingChangedPathsaccumulator; validate() consumes the full set.drainDeferredQueuebypassed the listener wrapper. RawsessDispatchfor every resolveddeferredDepChange— tripped the bypass guard AND dropped the path from the accumulator. Fixed: route viasessDispatchWithListener.Plus the bypass guard itself: reducer fires
console.error(which CI's setup-jestexpect(console.error).not.toHaveBeenCalled()afterEach catches) on any path-bearing action that reaches the reducer without the__viaListenersentinel. Canary-only — production tree-shakes it.LRU cap (1024) on
_knownErrorPathswith eviction telemetry. Edit-mode audit data seed (idAttribute+ populated text fields). Thickerschema.statestub. Developer guide atweb/pgadmin/static/js/SchemaView/README.md.7 — CI workflows
check-canary-treeshake.yml: builds NON-canary bundle, runsscripts/verify-canary-treeshake.sh. Fails if any ofrunOptionsCanary,runValidationCanary,formatDivergence,applyAllowlist,__throw_on_canary_divergence__appear inapp.bundle.js.run-schemaview-ui-smoke.yml: PG 16, seeds a server viasetup.py load-servers, pre-creates a test table + function for edit-mode coverage, builds withCANARY_BUILD=true, starts pgAdmin, runs Playwright. Uploads server log + Playwright trace on failure.8 — Jest infra cleanups
Mocks for 5 ESM-only deps (
react-data-grid,react-dnd,react-dnd-html5-backend,react-resize-detector,marked) + global AMDdefine()stub. Unblocked 16 audit-blocked schemas and 10 pre-existing failing Jest suites along the way.9 — Default-on flip
The walker is enabled globally by default — opt-out via
incrementalOptions = falseon the schema instance (only used by tests). A newINCREMENTAL_OFF=1env var on the perf-bench harness lets the bench replay the old slow-walk path for same-session A/B comparison.Test scoreboard
Suites:
Tests:
Audit coverage:
UI smoke:
Per-keystroke latency at 10 outer × 1000 inner rows (synthetic
nested.spec.js, same-session A/B withINCREMENTAL_OFF=1):Branch contents
This branch was originally split as
dev/incremental-audit(safety net) +dev/incremental-flip(the actual default-on flip), then combined into one PR for reviewer convenience. The combined branch is 6 logical commits + 1 merge + 2 CI workflow commits sincemaster:dev/incremental-flipcarrying the default-on flip + perf-benchINCREMENTAL_OFFenv)