Add Confluence Data Center PAT detector#4886

Open
amanfcp wants to merge 4 commits into main from INS-400

Contributor

@amanfcp amanfcp commented Apr 14, 2026

Summary

Adds a detector for Confluence Data Center Personal Access Tokens.

  • No branded prefix. DC PATs are plain 44-character base64 strings. To avoid drowning in false positives, the detector base64-decodes each candidate and checks for the <numeric_id>:<random_bytes> structural shape at the byte level.
  • URL regex captures https?://host(:port)?, not just https://. On-prem Confluence commonly runs over plain HTTP inside corporate networks and on non-standard ports (:8090, :8443).
  • Token-only emission. When no URL is found in the chunk, we still emit an unverified result.

Testing

  • Unit tests: pattern coverage, verification coverage via gock.
  • No CI integration test. Confluence Data Center requires a paid Atlassian license.
  • Manual end-to-end: stood up Confluence DC locally via the official Docker image using Atlassian's time-bomb licenses for testing server apps, created and verified valid PATs.

Checklist:

  • Tests passing (make test-community)?
  • Lint passing (make lint; this requires golangci-lint)?

Note

Medium Risk
Adds a new detector that extracts and optionally verifies Confluence Data Center PATs via live HTTP calls to on-prem URLs, which may impact scan behavior (false positives/volume and network verification) if the patterns or endpoint pairing are off.

Overview
Adds a new ConfluenceDataCenter detector that finds 44-char base64 Confluence Data Center PATs using keyword-scoped regex plus a base64 structural decode check, and pairs them with nearby self-hosted http(s)://host(:port) URLs (or emits token-only results when no URL is present).
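
The URL pairing and token-only behaviour described above can be sketched roughly as follows. The regex, the finding type, and the pair function are all assumptions made for illustration; the detector's real pattern and result type are not shown in this conversation.

```go
package main

import (
	"fmt"
	"regexp"
)

// urlPat is an assumed pattern for http(s)://host(:port); the PR's real
// regex may be stricter or keyword-scoped.
var urlPat = regexp.MustCompile(`\bhttps?://[A-Za-z0-9.-]+(?::\d{1,5})?`)

// finding is a hypothetical result type. URL is empty for token-only results.
type finding struct {
	Token string
	URL   string
}

// pair emits one finding per (token, URL) combination found in the chunk,
// or a single token-only finding per token when the chunk has no URL.
func pair(tokens []string, chunk string) []finding {
	seen := make(map[string]struct{})
	var urls []string
	for _, u := range urlPat.FindAllString(chunk, -1) {
		if _, ok := seen[u]; !ok {
			seen[u] = struct{}{}
			urls = append(urls, u)
		}
	}
	var out []finding
	for _, tok := range tokens {
		if len(urls) == 0 {
			out = append(out, finding{Token: tok})
			continue
		}
		for _, u := range urls {
			out = append(out, finding{Token: tok, URL: u})
		}
	}
	return out
}

func main() {
	chunk := "confluence_url=http://wiki.corp.local:8090/display/ENG"
	for _, f := range pair([]string{"<token>"}, chunk) {
		fmt.Printf("token=%s url=%q\n", f.Token, f.URL)
	}
}
```

Capturing only scheme, host, and optional port keeps the paired URL usable as a base for the verification endpoint regardless of the path that appeared in the chunk.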

Implements optional verification by calling GET /rest/api/user/current with Bearer auth and caches unreachable hosts to avoid repeated lookups; includes unit tests for matching/URL pairing and verification status handling, and wires the detector into defaults and the DetectorType enum (proto/detector_type.proto, generated detector_type.pb.go, and engine defaults/tests).

Reviewed by Cursor Bugbot for commit 02eeee7. Bugbot is set up for automated code reviews on this repo.

@amanfcp amanfcp requested a review from a team April 14, 2026 17:16
@amanfcp amanfcp requested review from a team as code owners April 14, 2026 17:16
Comment thread pkg/detectors/confluencedatacenter/confluencedatacenter.go Outdated
Comment thread pkg/detectors/confluencedatacenter/confluencedatacenter.go
Comment thread pkg/detectors/confluencedatacenter/confluencedatacenter.go Outdated
Comment thread pkg/detectors/confluencedatacenter/confluencedatacenter.go
Comment thread pkg/detectors/confluencedatacenter/confluencedatacenter.go Outdated
Comment thread pkg/detectors/confluencedatacenter/confluencedatacenter.go Outdated
Contributor

@mustansir14 mustansir14 left a comment

Overall looks good to me. I have some questions/suggestions which you can look into.

Also it seems the credential pattern for this is the same as Jira Data Center, so there may be some overlap in results. I guess that's okay?

@amanfcp amanfcp requested a review from a team as a code owner April 15, 2026 11:49
Contributor Author

amanfcp commented Apr 16, 2026

(two screenshots attached)

The detector does not appear in the corpora test results

Contributor Author

amanfcp commented Apr 17, 2026

Waiting on #4872 to get merged, will resolve conflicts then.

@cursor cursor bot left a comment

Cursor Bugbot has reviewed your changes and found 1 potential issue.

validPAT2 = "NDc4MjM3OTUxMzk2OopoSkTDTnBcWIw0Wa4bico9zOLK"
// 44-char base64 that decodes to bytes NOT matching "<digits>:...". Used
// to exercise the structural post-filter.
nonStructural = "AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA"

Test token doesn't exercise structural post-filter as claimed

Medium Severity

The nonStructural constant "AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA" starts with A, but tokenPat requires [MNO] as the first character. This token is rejected by the regex itself, so isStructuralPAT is never called. The test named "structural post-filter rejects non-PAT base64" passes for the wrong reason (regex rejection, not structural filter rejection), leaving the false return path of isStructuralPAT for regex-matching candidates completely untested. The nonStructural value needs to start with M, N, or O to actually reach and exercise the structural check.

Additional Locations (1)
Contributor

@rosecodym rosecodym left a comment

Please tell me if I've got this right: If we find only a single candidate URL in a chunk, and that URL doesn't resolve, we return all findings as determinately unverified. This means that something like a transient DNS error could cause determinate unverification even though for our other detectors it doesn't do that.

I don't say this because I see an obvious way around it - doing so seems like it would need some sort of heuristic analysis, which would be new ground for us. I just want to ensure I understand the current implementation.

(I also left a non-blocking note about a possible optimization, which I'll leave up to your discretion.)

Comment on lines +93 to +95
if isStructuralPAT(m[1]) {
	uniqueTokens[m[1]] = struct{}{}
}
Contributor

I'm not sure how expensive or hot this code is, but it looks like you could avoid some work by checking for set presence before checking the match's structure, right? (If it's in the set, you don't need to check to see whether it's a PAT.)
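
The suggestion could look roughly like the sketch below. collectTokens and its function-typed parameter are hypothetical names introduced so the example can count calls; in the PR the loop operates on regex matches and calls isStructuralPAT directly.

```go
package main

import "fmt"

// collectTokens deduplicates candidates, skipping the structural check for
// any candidate that has already been accepted into the set.
func collectTokens(candidates []string, isStructural func(string) bool) map[string]struct{} {
	unique := make(map[string]struct{})
	for _, c := range candidates {
		if _, ok := unique[c]; ok {
			continue // already accepted; no need to decode again
		}
		if isStructural(c) {
			unique[c] = struct{}{}
		}
	}
	return unique
}

func main() {
	calls := 0
	// Stand-in check that accepts only "a" and counts how often it runs.
	check := func(s string) bool {
		calls++
		return s == "a"
	}
	tokens := collectTokens([]string{"a", "a", "b", "a"}, check)
	fmt.Println(len(tokens), calls) // prints "1 2"
}
```

This only saves work for repeated candidates that passed the check. Repeated candidates that failed are still re-checked unless a separate set of rejected values is kept as well.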
