Skip to content

fix(rules): improve precision of 4 high-FP dotnet opengrep rules#63

Open
David Larsen (dc-larsen) wants to merge 4 commits intomainfrom
fix/dotnet-sast-rule-precision
Open

fix(rules): improve precision of 4 high-FP dotnet opengrep rules#63
David Larsen (dc-larsen) wants to merge 4 commits intomainfrom
fix/dotnet-sast-rule-precision

Conversation

@dc-larsen
Copy link
Copy Markdown
Contributor

@dc-larsen David Larsen (dc-larsen) commented Apr 11, 2026

Summary

Fixes 4 dotnet opengrep rules that produced 150 of 170 total false positives (88%) in a customer SAST evaluation, inflating the reported FP rate to 91%.

  • dotnet-xss-response-write: Converted to taint mode. Was matching any .Write() including Serilog ITextFormatter log sinks (74 FPs). Now tracks data flow from user input to Response.Write.
  • dotnet-hardcoded-credentials: Added value inspection and credential API patterns. Was matching variable names alone, flagging config paths like UseCaptchaOnResetPassword (31 FPs).
  • dotnet-crypto-failures: Rewrote to target weak algorithms (3DES/DES/RC2/RijndaelManaged). Was flagging Encoding.UTF8.GetBytes() which triggers on the recommended SHA256.HashData() pattern (30 FPs).
  • dotnet-path-traversal: Converted to taint mode. Was matching all Path.Combine() calls including framework paths like _env.WebRootPath (15 FPs).

Benchmark data (NIST Juliet C# Test Suite)

Rule Before Precision After Precision Before Recall After Recall
xss-response-write 41.6% 100% 47.8% 24.3%
hardcoded-credentials 0% 100% 0% 3.6%
crypto-failures 36.7% 100% 51.4% 50.0%
path-traversal 0% 100% 0% 45.2%

All 4 rules achieve 100% precision (zero false positives) post-fix. Recall trade-offs are acceptable: taint-mode rules only fire when user input actually reaches the sink, which is the correct behavior for security analysis.

Customer impact

Eliminates all 150 FPs from these 4 rules. Remaining findings (36 total, 20 FP) produce a ~56% FP rate, consistent with pattern-matching SAST tools. Further tuning via community rules and per-language scoping can reduce this further.

Testing

  • opengrep --validate passes on full dotnet.yml (40 rules, 0 errors)
  • Ran fixed rules through opengrep v1.19.0 against NIST Juliet C# test cases (4,300+ files)
  • Results match between opengrep and semgrep (identical TP/FP/FN counts)
  • Verified zero false positives across all 4 fixed rules
  • pytest passes (139 tests, 0 failures)
  • End-to-end scan through Socket Basics pipeline on a .NET repo

@dc-larsen David Larsen (dc-larsen) requested a review from a team as a code owner April 11, 2026 04:22
@dc-larsen David Larsen (dc-larsen) changed the title fix(rules): improve precision of 4 high-FP dotnet Semgrep rules fix(rules): improve precision of 4 high-FP dotnet opengrep rules Apr 11, 2026
Addresses customer SAST evaluation feedback where 4 rules produced 150/170
false positives (88% of all FPs), inflating the reported FP rate to 91%.

Rules fixed:
- dotnet-xss-response-write: Convert to taint mode. Previously matched any
  .Write() call including Serilog ITextFormatter log sinks. Now requires
  data flow from user input sources to Response.Write sinks.
- dotnet-hardcoded-credentials: Add value inspection and credential API
  patterns. Previously matched on variable names alone, flagging config
  key paths like "UseCaptchaOnResetPassword".
- dotnet-crypto-failures: Target actual weak algorithms (3DES, DES, RC2,
  RijndaelManaged) instead of Encoding.UTF8.GetBytes() which flagged the
  recommended SHA256.HashData(Encoding.UTF8.GetBytes(...)) pattern.
- dotnet-path-traversal: Convert to taint mode. Previously matched all
  Path.Combine() calls including those using framework-provided paths
  like _env.WebRootPath.

Validated with opengrep v1.19.0 against NIST Juliet C# test suite:
  xss-response-write:    Prec 41.6% -> 100%, Recall 47.8% -> 24.3%
  hardcoded-credentials: Prec 0.0%  -> 100%, Recall 0.0%  -> 3.6%
  crypto-failures:       Prec 36.7% -> 100%, Recall 51.4% -> 50.0%
  path-traversal:        Prec 0.0%  -> 100%, Recall 0.0%  -> 45.2%
@dc-larsen David Larsen (dc-larsen) force-pushed the fix/dotnet-sast-rule-precision branch from 4958b6f to cdb7224 Compare April 11, 2026 11:36
@dc-larsen
Copy link
Copy Markdown
Contributor Author

David Larsen (dc-larsen) commented Apr 11, 2026

E2E Validation: dotnet rule precision (round 1)

Validated the updated rules against two intentionally vulnerable .NET repositories using opengrep v1.19.0 and the full socket-basics pipeline.

Test Targets

Repo Framework Description
the-most-vulnerable-dotnet-app .NET 10.0 Educational repo showcasing common .NET vulnerabilities
AspGoat ASP.NET Core 8.0 OWASP vulnerable ASP.NET Core app

Results

the-most-vulnerable-dotnet-app

Rule Before After Notes
dotnet-hardcoded-credentials 8 0 Correctly filters out empty-string DTO property defaults (Password = "")
dotnet-path-traversal 15 0 Eliminates static Path.Combine matches (framework paths, hardcoded strings)
dotnet-xss-response-write 0 0 Repo uses Response.WriteAsync() (ASP.NET Core), not classic Response.Write()
dotnet-crypto-failures 0 0 No weak crypto algorithms (3DES/DES/RC2/Rijndael) present

AspGoat

Rule Before After Notes
dotnet-hardcoded-credentials 1 1 True positive preserved: defaultPassword = "admin123"
dotnet-path-traversal 1 0 Taint sources cover classic ASP.NET; ASP.NET Core patterns addressed in round 2

Pipeline Integration

Check Status
opengrep connector loads Pass
.socket.facts.json output valid Pass
Socket API submission Pass
Alert metadata (CWE, OWASP, fix, references) Present and correct

Observation

The taint-mode rules correctly eliminate pattern-match false positives. Initial taint sources target classic ASP.NET WebForms (Request.QueryString, Request.Form). ASP.NET Core controller parameter binding ([FromQuery], [FromBody], IFormFile) and Response.WriteAsync coverage added in round 2.

@dc-larsen
Copy link
Copy Markdown
Contributor Author

David Larsen (dc-larsen) commented Apr 11, 2026

E2E Validation: ASP.NET Core coverage added (round 2)

Extended taint sources and sinks to cover ASP.NET Core patterns, then re-validated against both test repos.

Changes

Sources (both path-traversal and xss-response-write):

  • Controller parameter binding: [FromQuery], [FromBody], [FromRoute], [FromForm]
  • File upload: IFormFile.FileName, IFormFile.ContentType

Sinks (xss-response-write):

  • Response.WriteAsync(...), HttpContext.Response.WriteAsync(...), Html.Raw(...)

Sinks (path-traversal):

  • Fully-qualified System.IO.File.* variants (ReadAllText, WriteAllText, ReadAllBytes, WriteAllBytes, Exists, Open, Delete) for ASP.NET Core code that uses explicit namespace qualification

Results

the-most-vulnerable-dotnet-app

Rule Before After (round 2) Findings
dotnet-path-traversal 15 false positives 4 true positives [FromQuery] filenameFile.Exists/File.ReadAllText; [FromBody] requestFile.WriteAllText. All with dataflow traces.
dotnet-xss-response-write 0 1 true positive [FromQuery] urlUrlDecode → string concat → Response.WriteAsync. Dataflow trace present.
dotnet-hardcoded-credentials 8 false positives 0 Empty-string defaults correctly filtered.

AspGoat

Rule Before After (round 2) Findings
dotnet-path-traversal 1 false positive 1 true positive IFormFile.FileNamePath.Combinenew FileStream. Dataflow trace present.
dotnet-hardcoded-credentials 1 true positive 1 true positive defaultPassword = "admin123" preserved.

Validation

Check Result
opengrep --validate 0 errors, 40 rules
pytest 139 passed
Socket API (round 2 scans) All 7 findings visible in dashboard
False positives (target rules) 0 across both repos
True positives (target rules) 7 across both repos

…net rules

Add controller parameter binding sources ([FromQuery], [FromBody],
[FromRoute], [FromForm]) and IFormFile.FileName to path-traversal
and XSS taint rules. Add Response.WriteAsync and Html.Raw as XSS
sinks. Add fully-qualified System.IO.File.* sink variants for
ASP.NET Core code that uses explicit namespace qualification.

E2E tested against two vulnerable .NET repos: 7 true positives
found, zero false positives.
Copy link
Copy Markdown
Contributor

@lelia lelia left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

David Larsen (@dc-larsen) Thanks for putting this together! Reducing 150 false positives from 4 rules is meaningful, and the move toward taint-style matching is directionally the right fix for the worst offenders here. I also validated the updated dotnet.yml on my end with opengrep --validate, and the config is syntactically valid.

I do see a few semantic issues worth tightening before merge - summarized here, but see my review comments for more details:

  1. dotnet-path-traversal: Path.GetFullPath(...) is currently treated as a sanitizer at socket_basics/rules/dotnet.yml:410
  2. dotnet-hardcoded-credentials: the variable-name regex got much narrower at socket_basics/rules/dotnet.yml:161
  3. dotnet-crypto-failures: one broad pattern still looks FP-prone at socket_basics/rules/dotnet.yml:805

Overall, I think the main idea in this PR is solid: move away from simple pattern matching where it was clearly overfiring, and add more context-aware logic. I’d just recommend tightening the three cases above so we don’t trade one class of false positives for a quieter set of false negatives or misclassifications.

Comment thread socket_basics/rules/dotnet.yml Outdated
Comment thread socket_basics/rules/dotnet.yml Outdated
Comment thread socket_basics/rules/dotnet.yml Outdated
- Remove Path.GetFullPath() as path-traversal sanitizer (normalizes but
  does not prevent traversal on its own)
- Broaden hardcoded-credentials variable regex to cover idiomatic C#
  naming: apiKey, connectionString, privateKey, accessKey, authToken
- Remove overly broad Base64 encoding pattern from crypto-failures
  (benign encoding/transport use generates noise)
- pattern: Path.GetFileName(...)
# Framework-provided base paths are safe sources, not sanitizers,
# but if the result is validated against a base we consider it sanitized
- pattern: $X.StartsWith($BASE)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is meant to be a sanitizer for validated paths, but in taint mode this pattern by itself likely only matches the boolean check expression, not the tainted path variable after the check. That means correctly validated code may still be flagged. If the goal is to treat the path as safe after a base-directory check, this probably needs side-effect/control-flow modeling rather than a plain sanitizer pattern, eg:

Suggested change
- pattern: $X.StartsWith($BASE)
- patterns: $X.StartsWith($BASE)
- pattern: $X.StartsWith($BASE)
- focus-metavariable: $X
- by-side-effect: true

This should make the intent a bit clearer: treat the checked path variable as sanitized after the base-path validation, instead of only matching the StartsWith(...) call expression.

@lelia
Copy link
Copy Markdown
Contributor

lelia commented Apr 20, 2026

David Larsen (@dc-larsen) Just one remaining tweak and we should be good to go 👍

The $X.StartsWith($BASE) sanitizer was matching the boolean expression
instead of marking the checked path variable as sanitized, so correctly
validated paths were still flagged as tainted.

Use focus-metavariable + by-side-effect so the sanitizer applies to $X
itself. Verified with a synthetic test case: scans of an unsanitized
File.ReadAllText still fire, but the same call guarded by
full.StartsWith("/var/data/") no longer does. Juliet CWE-23/36 results
unchanged at 432 findings (Juliet test cases do not exercise StartsWith
validation). opengrep --validate and pytest pass.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants