Shared: Provenance-based filtering of flow summaries #21051
base: main
Differences in coverage: generated file changes for java
- `Apache Commons Collections <https://commons.apache.org/proper/commons-collections/>`_,"``org.apache.commons.collections``, ``org.apache.commons.collections4``",,1600,,,,,,,
+ `Apache Commons Collections <https://commons.apache.org/proper/commons-collections/>`_,"``org.apache.commons.collections``, ``org.apache.commons.collections4``",,1615,,,,,,,
- Java Standard Library,``java.*``,10,4628,260,99,,9,,,26
+ Java Standard Library,``java.*``,10,4629,260,99,,9,,,26
- `Spring <https://spring.io/>`_,``org.springframework.*``,46,492,143,26,,28,14,,35
+ `Spring <https://spring.io/>`_,``org.springframework.*``,46,494,143,26,,28,14,,35
- Totals,,363,26372,2681,404,16,134,33,1,409
+ Totals,,363,26390,2681,404,16,134,33,1,409
- java.util,48,2,1339,,,,,,,,,1,,,,,,,,,,,34,,,,3,,,,5,2,,1,2,,,,,,,,,,,,,,2,,,558,781
+ java.util,48,2,1340,,,,,,,,,1,,,,,,,,,,,34,,,,3,,,,5,2,,1,2,,,,,,,,,,,,,,2,,,558,782
- org.apache.commons.collections4,,,800,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,17,783
+ org.apache.commons.collections4,,,815,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,17,798
- org.springframework.web.util,,9,157,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,9,132,25
+ org.springframework.web.util,,9,159,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,9,134,25 |
Missing manual models were added using the following code added to `FlowSummaryImpl.qll`:
```ql
// Relates a callable `c` to a summary model row that applies to it.
// `isExact` is the exactness flag reported by `interpretElement` for the match.
private predicate testsummaryElement(
  Input::SummarizedCallableBase c, string namespace, string type, boolean subtypes, string name,
  string signature, string ext, string originalInput, string originalOutput, string kind,
  string provenance, string model, boolean isExact
) {
  exists(string input, string output, Callable baseCallable |
    summaryModel(namespace, type, subtypes, name, signature, ext, originalInput, originalOutput,
      kind, provenance, model) and
    baseCallable = interpretElement(namespace, type, subtypes, name, signature, ext, isExact) and
    (
      c.asCallable() = baseCallable and input = originalInput and output = originalOutput
      or
      correspondingKotlinParameterDefaultsArgSpec(baseCallable, c.asCallable(), originalInput,
        input) and
      correspondingKotlinParameterDefaultsArgSpec(baseCallable, c.asCallable(), originalOutput,
        output)
    )
  )
}

// Finds callables that have an inexact model with the given input/output/kind and also
// some exact model with the same provenance, but no exact model with that
// input/output/kind. The returned namespace/type/name/signature are those of the exact model.
private predicate testsummaryElement2(
  string namespace, string type, boolean subtypes, string name, string signature, string ext,
  string originalInput, string originalOutput, string kind, string provenance, string model
) {
  exists(Input::SummarizedCallableBase c |
    testsummaryElement(c, _, _, _, _, _, _, originalInput, originalOutput, kind, provenance,
      model, false) and
    testsummaryElement(c, namespace, type, subtypes, name, signature, ext, _, _, _, provenance,
      _, true) and
    not testsummaryElement(c, _, _, _, _, _, _, originalInput, originalOutput, kind, provenance,
      _, true)
  )
}

// Formats each missing model as a row that can be pasted into a model file.
private string getAMissingManualModel() {
  exists(
    string namespace, string type, boolean subtypes, string name, string signature, string ext,
    string originalInput, string originalOutput, string kind, string provenance, string model
  |
    testsummaryElement2(namespace, type, subtypes, name, signature, ext, originalInput,
      originalOutput, kind, provenance, model) and
    result =
      "- [\"" + namespace + "\", \"" + type + "\", True, \"" + name + "\", \"" + signature +
        "\", \"\", \"" + originalInput + "\", \"" + originalOutput + "\", \"" + kind + "\", \"" +
        provenance + "\"]"
  )
}
```
Pull request overview
This PR aligns the logic across languages for how flow summaries are prioritized based on provenance (manual vs. generated) and exactness (direct match vs. inherited/overridden). The key changes implement a consistent filtering mechanism to determine which flow summaries are considered "relevant."
Changes:
- Introduced a unified `relevantSummary` predicate in the shared flow summary library that filters flow summaries based on provenance and exactness
- Modified `SummarizedCallable.propagatesFlow` to include `Provenance p` and `boolean isExact` parameters, replacing the separate `hasProvenance` and `hasExactModel` predicates
- Applied the `::Range` pattern to the `SummarizedCallable` class across Swift, Rust, and Ruby (all languages except C++)
Reviewed changes
Copilot reviewed 131 out of 138 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| shared/dataflow/codeql/dataflow/internal/FlowSummaryImpl.qll | Core implementation of unified flow summary filtering logic and RelevantSummarizedCallable class |
| swift/ql/lib/codeql/swift/dataflow/FlowSummary.qll | Applied Range pattern and updated to use RelevantSummarizedCallable |
| swift/ql/lib/codeql/swift/dataflow/internal/FlowSummaryImpl.qll | Added callableFromSource predicate |
| swift/ql/lib/codeql/swift/dataflow/ExternalFlow.qll | Updated adapter to provide provenance and exactness information |
| rust/ql/lib/codeql/rust/dataflow/FlowSummary.qll | Applied Range pattern and updated to use RelevantSummarizedCallable |
| rust/ql/lib/codeql/rust/dataflow/internal/FlowSummaryImpl.qll | Added callableFromSource predicate |
| rust/ql/lib/codeql/rust/dataflow/internal/ModelsAsData.qll | Refactored to populate provenance/exactness in propagatesFlow instead of filtering afterward |
| rust/ql/lib/codeql/rust/frameworks/stdlib/*.model.yml | Added manual overrides for generated models to ensure proper precedence |
| ruby/ql/lib/codeql/ruby/dataflow/FlowSummary.qll | Applied Range pattern with backward compatibility wrapper |
| ruby/ql/lib/codeql/ruby/frameworks/**/*.qll | Updated all flow summary classes to extend SummarizedCallable::Range |
| ruby/ql/test/library-tests/dataflow/summaries/Summaries.ql | Updated test classes to use Range pattern |
| rust/ql/test/query-tests/security/**/*.expected | Updated test expectations reflecting removal of duplicate models |
| swift/ql/lib/change-notes/2026-01-16-summarized-callable.md | Change note documenting breaking API change |
| rust/ql/lib/change-notes/2026-01-16-summarized-callable.md | Change note documenting breaking API change |
| ruby/ql/lib/change-notes/2026-01-16-summarized-callable.md | Change note documenting breaking API change |
| rust/ql/test/library-tests/dataflow/models/models.ql | Updated to filter manual models in invalidSpecComponent query |
| python/ql/lib/semmle/python/frameworks/Flask.qll | Updated to use Range pattern |
| javascript/ql/lib/utils/test/InlineSummaries.qll | Updated to use Range pattern |
---
category: minorAnalysis
---
* The predicate `SummarizedCallable.propagatesFlow` has been added the columns `Provenance p` and `boolean isExact`, and as a consequence the predicates `SummarizedCallable.hasProvenance` and `SummarizedCallable.hasExactModel` have been removed.
Copilot
AI
Jan 16, 2026
Grammatical error: "has been added the columns" should be "has had the columns added" or "has been given the columns".
)
or
exists(
SummarizedCallableImpl callable, DataFlow::CallNode call, SummaryComponentStack input,
`SummarizedCallableImpl` shouldn't be used here I guess, as the QLDoc says not to. I'm not totally sure what we should use instead. I guess just `SummarizedCallable`, as we should only consider relevant summarized callables.
This PR aligns the logic across languages for how flow summaries are prioritized based on provenance and exactness (that is, whether a model is defined directly for a function or for a function that is implemented/overridden).
A flow summary is considered relevant if:
Note that for dynamic languages we currently pretend that no source code is available for functions with flow summaries, so 3.(a) holds vacuously.
Points 2 and 3.c represent a change for e.g. Java, where we would previously union exact and inexact models, which meant that it was not possible to overrule inexact models. As a consequence, some inexact manual models have been replicated. DCA for Java reports some lost `java/sensitive-log` results on `apache_solr`, but looking at those results, they all have flow paths of length > 150, so they are almost certainly false positives, and most likely a consequence of 3.c.

In order for the logic to be defined in the shared flow summary library, I had to move provenance and exactness information into the `propagatesFlow` predicate, which is a breaking change.

Lastly, I have applied the `::Range` pattern to the `SummarizedCallable` class for all languages except C++, which currently does not expose this class. This means that `SummarizedCallable::Range` will contain all flow summaries, whereas `SummarizedCallable` will only contain relevant summaries.
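To illustrate the split between "all summaries" and "relevant summaries", here is a minimal, self-contained toy sketch of the `::Range` pattern. Everything in it is hypothetical: `toySummary`, `ToySummarizedCallable`, `ToyRow`, and `ToyRelevantSummarizedCallable` are made-up names, the data is invented, and the relevance rule (a generated row is dropped when a manual row exists for the same callable) is a deliberate simplification, not the rule set implemented by this PR.

```ql
/** Toy summary rows: (callable, provenance). Hypothetical data, not real models. */
private predicate toySummary(string callable, string provenance) {
  callable = "List.add" and provenance = "manual"
  or
  callable = "List.add" and provenance = "generated"
  or
  callable = "Map.put" and provenance = "generated"
}

module ToySummarizedCallable {
  /** Extension point: contains every contributed summary, relevant or not. */
  abstract class Range extends string {
    bindingset[this]
    Range() { any() }

    /** Gets the name of the callable this summary applies to. */
    abstract string getCallable();

    /** Gets the provenance of this summary, `"manual"` or `"generated"`. */
    abstract string getProvenance();
  }
}

/** Contributes each row of `toySummary` to the extension point. */
private class ToyRow extends ToySummarizedCallable::Range {
  string callable;
  string provenance;

  ToyRow() { toySummary(callable, provenance) and this = callable + ";" + provenance }

  override string getCallable() { result = callable }

  override string getProvenance() { result = provenance }
}

/**
 * Filtered view over the extension point. `instanceof` is used (rather than
 * `extends`) so that this class does not itself contribute to `Range`, which
 * the negation below depends on.
 */
class ToyRelevantSummarizedCallable instanceof ToySummarizedCallable::Range {
  ToyRelevantSummarizedCallable() {
    // Simplified rule (an assumption for this sketch): manual rows always win,
    // and generated rows survive only if no manual row covers the same callable.
    super.getProvenance() = "manual"
    or
    not exists(ToySummarizedCallable::Range manual |
      manual.getProvenance() = "manual" and
      manual.getCallable() = super.getCallable()
    )
  }

  string toString() { result = super.toString() }
}

from ToyRelevantSummarizedCallable s
select s
```

Under these toy rules the query should keep the manual `List.add` row and the generated `Map.put` row, while `ToySummarizedCallable::Range` still contains all three rows. Code that contributes summaries extends the `Range` class, and code that consumes them uses the filtered class.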