fix(report): keep ReDoS distinct from generic DoS in SARIF class hash #662
Open
sean-kim05 wants to merge 1 commit into
`_class_keyword` scans `_VULN_CLASS_KEYWORDS` in order and returns the first substring match, so a more specific class must precede any class whose name it contains — an invariant the surrounding comment spells out. "denial of service" was listed before "regex denial of service", so any finding titled "...Regex Denial of Service..." resolved to the generic "denial of service" class instead. For locationless (synthetic-anchored) findings this collapses the `synth_class` component of `_primary_fingerprint`, so a ReDoS finding and a generic DoS finding on the same CWE get the *same* `partialFingerprints.primaryLocationLineHash` — and a SARIF consumer such as GitHub code-scanning silently dedups one genuine finding away. Reorder so "regex denial of service" precedes "denial of service", and add regression tests at both the `_class_keyword` and `write_sarif` levels (the latter asserts the two findings no longer share a fingerprint).
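The shadowing described above can be reproduced with a minimal sketch. Everything here is illustrative: `first_match`, `BUGGY_ORDER`, and `FIXED_ORDER` are hypothetical stand-ins, not the actual `_class_keyword` or `_VULN_CLASS_KEYWORDS`, and the case-insensitive comparison is an assumption about how titles are matched.

```python
from typing import Optional, Sequence

# Hypothetical two-entry lists; the real _VULN_CLASS_KEYWORDS is longer.
BUGGY_ORDER = ("denial of service", "regex denial of service")
FIXED_ORDER = ("regex denial of service", "denial of service")


def first_match(title: str, keywords: Sequence[str]) -> Optional[str]:
    """Return the first keyword that occurs in the title, or None.

    Assumption: matching is case-insensitive substring search.
    """
    lowered = title.lower()
    for keyword in keywords:
        if keyword in lowered:
            return keyword
    return None


redos_title = "Regex Denial of Service in URL validator"
dos_title = "Denial of Service via unbounded upload"

# With the generic entry first, the ReDoS title is classified as generic DoS.
print(first_match(redos_title, BUGGY_ORDER))
# With the specific entry first, each title gets its own class.
print(first_match(redos_title, FIXED_ORDER))
print(first_match(dos_title, FIXED_ORDER))
```

Because the scan stops at the first hit, list order is the only thing that separates the two classes, which is why a pure data reorder is enough to fix the collision.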
Greptile Summary
This PR keeps ReDoS SARIF findings distinct from generic DoS findings.
Confidence Score: 5/5
This looks safe to merge.
Closes #669
What
`_class_keyword` (in `strix/report/sarif.py`) scans `_VULN_CLASS_KEYWORDS` in order and returns the first substring match. The comment above the list documents the resulting invariant: a more specific class must precede any class whose name it contains.
But `"denial of service"` was listed before `"regex denial of service"`. Since `"denial of service"` is a substring of `"regex denial of service"`, any finding titled `"…Regex Denial of Service…"` matched the generic entry first and resolved to class `"denial of service"`, which is the exact "collapse distinct findings" failure the comment warns against.
Why it matters
For locationless findings (anchored synthetically to `SECURITY.md`), the class keyword is the only differentiator in `_primary_fingerprint`: it is appended as the `synth_class:` component. So a ReDoS finding and a generic DoS finding on the same CWE end up with the same `partialFingerprints.primaryLocationLineHash`. A SARIF consumer such as GitHub code-scanning keys alert identity on `partialFingerprints`, so one of the two genuine findings gets silently deduplicated away.
Fix
Reorder so `"regex denial of service"` precedes `"denial of service"` (with a comment explaining why the order is load-bearing). This is a one-line data change with no logic change.
I also audited the rest of `_VULN_CLASS_KEYWORDS` for the same substring-shadowing hazard; this was the only violation (e.g. `"rate limiting"` is already correctly ordered before `"rate limit"`).
Tests
Added two regression tests to `tests/test_sarif.py`:
- `test_class_keyword_prefers_specific_regex_dos_over_generic_dos` (unit-level): the specific class is returned, and a genuine generic DoS still maps to `"denial of service"`.
- `test_write_sarif_regex_dos_not_merged_into_generic_dos` (end-to-end): two locationless same-CWE findings (ReDoS + generic DoS) get distinct fingerprints.
Verified RED→GREEN: with the reorder reverted, the end-to-end test fails with `assert 1 == 2` (both findings collapse to a single fingerprint); with the fix it passes. Full suite: `82 passed` (the 2 pre-existing Windows-only failures in `test_config_loader` / `test_local_sources` are unrelated to this change). `ruff format`, `ruff check`, and `mypy strix/report/sarif.py` all clean.
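The manual audit for substring shadowing mentioned under Fix could, in principle, be automated as a guard test. The following is only a sketch of that idea and not part of this PR; `shadowed_pairs` is a hypothetical helper, and the sample tuples are illustrative rather than the real `_VULN_CLASS_KEYWORDS` contents.

```python
from typing import List, Sequence, Tuple


def shadowed_pairs(keywords: Sequence[str]) -> List[Tuple[str, str]]:
    """Return (earlier, later) pairs where `earlier` is a substring of `later`.

    Under first-substring-match scanning, any title containing `later`
    also contains `earlier`, so the `later` entry can never be returned.
    """
    violations = []
    for i, earlier in enumerate(keywords):
        for later in keywords[i + 1:]:
            if earlier in later:
                violations.append((earlier, later))
    return violations


# Illustrative orderings only (not the project's actual list).
buggy = ("denial of service", "regex denial of service")
fixed = ("regex denial of service", "denial of service",
         "rate limiting", "rate limit")

print(shadowed_pairs(buggy))  # reports the pre-fix ordering as a violation
print(shadowed_pairs(fixed))  # an empty list means no entry is shadowed
```

A test asserting that such a helper returns an empty list for the real keyword list would catch future entries that are added in the wrong position.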