
fix(report): keep ReDoS distinct from generic DoS in SARIF class hash#662

Open
sean-kim05 wants to merge 1 commit into
usestrix:mainfrom
sean-kim05:fix/sarif-redos-class-order


@sean-kim05 sean-kim05 commented Jul 3, 2026


Closes #669

What

_class_keyword (in strix/report/sarif.py) scans _VULN_CLASS_KEYWORDS in order and returns the first substring match. The comment above the list documents the resulting invariant:

Order matters — first match wins, so precise terms come before fuzzy ones … a future maintainer adding sloppy entries could collapse distinct findings to the same class hash.

But "denial of service" was listed before "regex denial of service". Since "denial of service" is a substring of "regex denial of service", any finding titled "…Regex Denial of Service…" matched the generic entry first and resolved to class "denial of service" — the exact "collapse distinct findings" failure the comment warns against.
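A minimal sketch of the shadowing, not the actual `strix/report/sarif.py` code: the two tuples are hypothetical two-entry stand-ins for `_VULN_CLASS_KEYWORDS`, and the lowercasing step and the `class_keyword` signature are assumptions made for illustration.

```python
# Hypothetical, shortened stand-ins for _VULN_CLASS_KEYWORDS (before and after the reorder).
BUGGY_ORDER = ("denial of service", "regex denial of service")
FIXED_ORDER = ("regex denial of service", "denial of service")


def class_keyword(title, keywords):
    """Return the first keyword that occurs as a substring of the title.

    Assumption: matching is case-insensitive; the real helper may differ.
    """
    lowered = title.lower()
    for keyword in keywords:
        if keyword in lowered:
            return keyword  # first match wins, so list order decides the class
    return None


title = "Regex Denial of Service in route parser"
buggy_class = class_keyword(title, BUGGY_ORDER)  # generic entry matches first
fixed_class = class_keyword(title, FIXED_ORDER)  # specific entry matches first
```

With the buggy order the generic entry is reached first and the specific one is never consulted; swapping the two entries is enough to change the result.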

Why it matters

For locationless findings (anchored synthetically to SECURITY.md), the class keyword is the only differentiator in _primary_fingerprint — it's appended as the synth_class: component. So a ReDoS finding and a generic DoS finding on the same CWE end up with the same partialFingerprints.primaryLocationLineHash. A SARIF consumer such as GitHub code-scanning keys alert identity on partialFingerprints, so one of the two genuine findings gets silently deduplicated away.
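To illustrate why the class keyword decides the outcome here, the sketch below hashes a few components the way a fingerprint function might. It is not `_primary_fingerprint`: the function name `sketch_fingerprint`, the choice of components, the `|` separator, SHA-256, and the `CWE-400` value are all assumptions; only the `synth_class:` component and the `SECURITY.md` anchor come from the description above.

```python
import hashlib


def sketch_fingerprint(cwe, anchor, synth_class):
    """Hypothetical fingerprint: hash of the CWE, the anchor file, and the class."""
    parts = [cwe, anchor, f"synth_class:{synth_class}"]
    return hashlib.sha256("|".join(parts).encode("utf-8")).hexdigest()


# Before the fix: both findings resolve to the generic class, so the hashes collide.
redos_before = sketch_fingerprint("CWE-400", "SECURITY.md", "denial of service")
dos_before = sketch_fingerprint("CWE-400", "SECURITY.md", "denial of service")

# After the fix: the ReDoS finding keeps its own class, so the hashes differ.
redos_after = sketch_fingerprint("CWE-400", "SECURITY.md", "regex denial of service")
dos_after = sketch_fingerprint("CWE-400", "SECURITY.md", "denial of service")
```

When every other input is identical, as it is for two locationless findings on the same CWE, the class string is the only thing that can separate the two hashes.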

Fix

Reorder so "regex denial of service" precedes "denial of service" (with a comment explaining why the order is load-bearing). One-line data change — no logic change.

I also audited the rest of _VULN_CLASS_KEYWORDS for the same substring-shadowing hazard; this was the only violation (e.g. "rate limiting" is already correctly ordered before "rate limit").
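The audit can be expressed as a small check, sketched here under the assumption that the keyword list is a flat sequence of strings. `find_shadowed` is a hypothetical helper, not something this PR adds.

```python
def find_shadowed(keywords):
    """Return (earlier, later) pairs where the earlier entry is a substring of a later one.

    Under first-match-wins substring matching, such a later entry can never be selected.
    """
    violations = []
    for index, earlier in enumerate(keywords):
        for later in keywords[index + 1:]:
            if earlier in later:
                violations.append((earlier, later))
    return violations


# The ordering this PR fixes is reported as a violation.
buggy = find_shadowed(["denial of service", "regex denial of service"])

# The corrected ordering and the existing rate-limit pair are both clean.
fixed = find_shadowed(["regex denial of service", "denial of service"])
rate = find_shadowed(["rate limiting", "rate limit"])
```

A check of this shape could also run as a test over the real list, so that a future entry added in the wrong position fails fast instead of silently merging fingerprints.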

Tests

Added two regression tests to tests/test_sarif.py:

  • test_class_keyword_prefers_specific_regex_dos_over_generic_dos — unit-level: the specific class is returned, and a genuine generic DoS still maps to "denial of service".
  • test_write_sarif_regex_dos_not_merged_into_generic_dos — end-to-end: two locationless same-CWE findings (ReDoS + generic DoS) get distinct fingerprints.

Verified RED→GREEN: with the reorder reverted, the end-to-end test fails with assert 1 == 2 (both findings collapse to a single fingerprint); with the fix it passes. Full suite: 82 passed (the 2 pre-existing Windows-only failures in test_config_loader/test_local_sources are unrelated to this change). ruff format, ruff check, and mypy strix/report/sarif.py all clean.

`_class_keyword` scans `_VULN_CLASS_KEYWORDS` in order and returns the
first substring match, so a more specific class must precede any class
whose name it contains — an invariant the surrounding comment spells out.
"denial of service" was listed before "regex denial of service", so any
finding titled "...Regex Denial of Service..." resolved to the generic
"denial of service" class instead.

For locationless (synthetic-anchored) findings this collapses the
`synth_class` component of `_primary_fingerprint`, so a ReDoS finding and
a generic DoS finding on the same CWE get the *same*
`partialFingerprints.primaryLocationLineHash` — and a SARIF consumer such
as GitHub code-scanning silently dedups one genuine finding away.

Reorder so "regex denial of service" precedes "denial of service", and
add regression tests at both the `_class_keyword` and `write_sarif`
levels (the latter asserts the two findings no longer share a
fingerprint).

greptile-apps Bot commented Jul 3, 2026

Contributor

Greptile Summary

This PR keeps ReDoS SARIF findings distinct from generic DoS findings.

  • Reorders the vulnerability keyword list so the specific ReDoS phrase matches first.
  • Adds a comment explaining the first-match ordering requirement.
  • Adds tests for direct keyword selection and synthetic SARIF fingerprint separation.

Confidence Score: 5/5

This looks safe to merge.

  • No blocking issues found in the changed code.

Important Files Changed

| Filename | Overview |
| --- | --- |
| strix/report/sarif.py | Moves "regex denial of service" before the broader "denial of service" keyword to preserve the intended first-match behavior. |
| tests/test_sarif.py | Adds tests that cover the ReDoS keyword choice and the end-to-end SARIF fingerprint split for locationless findings. |

Reviews (1): Last reviewed commit: "fix(report): keep ReDoS distinct from ge..."

Development

Successfully merging this pull request may close these issues.

[BUG] SARIF: "regex denial of service" shadowed by "denial of service", collapsing distinct fingerprints
