perf: Add pytest-benchmark suite and PR regression gate #1

Closed
JacksonWeber wants to merge 1 commit into main from perf/regression-gate

@JacksonWeber (Owner) commented May 23, 2026

Adds a pytest-benchmark perf suite plus a CI workflow that gates PRs against perf regressions, inspired by microsoft/ApplicationInsights-node.js#1500.

Scenarios (tests/perf/test_overhead.py)

| Test | Gating | What it measures |
| --- | --- | --- |
| test_azure_monitor_span | yes | configure_azure_monitor + tracer.start_as_current_span |
| test_azure_monitor_log | yes | configure_azure_monitor + logger.info |
| test_otel_span | no | Plain opentelemetry-sdk TracerProvider reference |
| test_otel_log | no | Plain opentelemetry-sdk LoggerProvider reference |

The Azure Monitor pair shares a session-scoped fixture that calls configure_azure_monitor once with all network-facing features disabled (live metrics, performance counters, offline storage). The OTel pair builds its own TracerProvider / LoggerProvider directly, so the global state that configure_azure_monitor mutates does not affect it. Non-gating scenarios are reported but never fail CI.

Skipped by default

pyproject.toml sets addopts = "--benchmark-skip" so a normal pytest invocation skips the perf tests entirely. The perf workflow opts in with --benchmark-only.

CI (.github/workflows/performance.yml)

  1. Install the PR distro, run pytest tests/perf --benchmark-only --benchmark-json=pr.json.
  2. Check out the base branch, install it, repeat → base.json.
  3. perf/compare.py produces a markdown report and exits non-zero if any gating scenario regresses by more than PERF_REGRESSION_THRESHOLD percent (default 15).
  4. Sticky PR comment via marocchino/sticky-pull-request-comment@v2; JSON + report uploaded as artifacts.
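The gating check in step 3 could look roughly like the sketch below. It is not the actual perf/compare.py: the `GATING` set, the function names, and the choice of the median as the compared statistic are assumptions made for illustration (the median is suggested by the local verification notes). It relies only on the `benchmarks[].name` and `benchmarks[].stats.median` fields of the pytest-benchmark JSON report.

```python
import os

# Assumption: gating scenarios are selected by test name. The real script
# may identify them differently (markers, groups, a config file, ...).
GATING = {"test_azure_monitor_span", "test_azure_monitor_log"}


def medians(report: dict) -> dict:
    """Map benchmark name -> median time from a pytest-benchmark JSON report."""
    return {b["name"]: b["stats"]["median"] for b in report["benchmarks"]}


def find_regressions(baseline: dict, candidate: dict, threshold_pct: float) -> list:
    """Return (name, percent_change) for gating scenarios slower than the threshold."""
    base, cand = medians(baseline), medians(candidate)
    regressions = []
    for name in sorted(GATING):
        # Scenarios missing from either report are skipped, not failed.
        if name not in base or name not in cand:
            continue
        change = (cand[name] - base[name]) / base[name] * 100.0
        if change > threshold_pct:
            regressions.append((name, change))
    return regressions


def exit_code(baseline: dict, candidate: dict) -> int:
    """Return 1 if any gating scenario regressed, else 0 (hypothetical helper)."""
    threshold = float(os.environ.get("PERF_REGRESSION_THRESHOLD", "15"))
    return 1 if find_regressions(baseline, candidate, threshold) else 0
```

In this sketch, informational scenarios such as test_otel_span never affect the exit code because they are not in `GATING`, which matches the "reported but never fail CI" behavior described above.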

Local usage

pip install -e . && pip install -r dev_requirements.txt
pytest tests/perf --benchmark-only --benchmark-json=pr.json
python -m perf.compare --baseline base.json --candidate pr.json --threshold 15

Verified locally

  • pytest tests/perf --benchmark-only runs all 4 benchmarks and writes a valid JSON.
  • perf/compare.py exits 0 on equal results and 1 when a gating scenario's median is doubled.
  • pytest tests/perf with no flags reports 4 skipped — confirms the default-skip works.

Adds a small pytest-benchmark suite that measures the overhead of the
distro on top of upstream OpenTelemetry, and a CI workflow that fails the
PR (and posts a sticky comment) when a gating scenario regresses by more
than PERF_REGRESSION_THRESHOLD percent (default 15%).

Scenarios (tests/perf/test_overhead.py):

- test_azure_monitor_span / test_azure_monitor_log (gating): configure_azure_monitor
  with network exporters disabled, then exercise the OTel API hot path.
- test_otel_span / test_otel_log (informational): plain opentelemetry-sdk
  TracerProvider/LoggerProvider reference using in-memory exporters.

The benchmarks are skipped by default via addopts = --benchmark-skip in
pyproject.toml; the performance workflow opts in with --benchmark-only.

perf/compare.py consumes the pytest-benchmark JSON, emits a markdown
comparison table, and exits non-zero on regression. The workflow runs the
suite once with the PR code installed and once with the base branch
installed, then compares.
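As an illustration of the markdown comparison table, here is a minimal sketch of a renderer. The column layout, the microsecond unit, and the `render_report` name are assumptions, not taken from perf/compare.py; it assumes the report stores `stats.median` in seconds, as pytest-benchmark's JSON output does.

```python
def render_report(baseline: dict, candidate: dict) -> str:
    """Render a markdown table comparing medians of two pytest-benchmark reports."""
    base = {b["name"]: b["stats"]["median"] for b in baseline["benchmarks"]}
    cand = {b["name"]: b["stats"]["median"] for b in candidate["benchmarks"]}
    lines = [
        "| Test | Base median (us) | PR median (us) | Change |",
        "| --- | ---: | ---: | ---: |",
    ]
    # Only scenarios present in both reports can be compared.
    for name in sorted(set(base) & set(cand)):
        change = (cand[name] - base[name]) / base[name] * 100.0
        lines.append(
            f"| {name} | {base[name] * 1e6:.2f} | {cand[name] * 1e6:.2f} | {change:+.1f}% |"
        )
    return "\n".join(lines)
```

The returned string can be written to a file and handed to the sticky-comment step unchanged.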

Inspired by microsoft/ApplicationInsights-node.js#1500.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>