@hvitved hvitved commented Dec 16, 2025

This PR aligns the logic across languages for how flow summaries are prioritized based on provenance and exactness (that is, whether a model is defined directly for a function or for a function that is implemented/overridden).

A flow summary is considered relevant if:

  1. It is a manual exact model, or
  2. It is a manual inexact model and there is no exact manual (neutral) model, or
  3. It is a generated model and (a) there is no source code available for the modeled callable, (b) there is no manual (neutral) model, and (c) the model is either exact, or it is inexact and there is no generated exact (neutral) model.

Note that for dynamic languages we currently pretend that no source code is available for functions with flow summaries, so 3.(a) holds vacuously.
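
To make the three rules concrete, here is a minimal Python sketch of the relevance check. It is not the QL implementation from the shared library: the `Model` class, its field names, and `is_relevant` are hypothetical, and the sketch assumes that 3.(c) is meant to let a generated exact model through while blocking a generated inexact model whenever a generated exact (neutral) model exists.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    """Hypothetical stand-in for one model row that applies to a callable."""
    provenance: str        # "manual" or "generated"
    exact: bool            # defined directly for the callable, not inherited
    neutral: bool = False  # neutral models only suppress other models here


def is_relevant(summary: Model, models: list, has_source: bool) -> bool:
    """Decide whether `summary` is relevant.

    `models` holds every summary and neutral model that applies to the same
    callable (including `summary` itself); `has_source` says whether source
    code is available for the callable.
    """
    def exists(provenance, exact=None):
        return any(
            m.provenance == provenance and (exact is None or m.exact == exact)
            for m in models
        )

    if summary.provenance == "manual":
        # Rules 1 and 2: exact manual models always win; inexact ones only
        # apply when no exact manual (neutral) model exists.
        return summary.exact or not exists("manual", exact=True)

    # Rule 3: generated models.
    return (
        not has_source                      # (a)
        and not exists("manual")            # (b)
        and (summary.exact or not exists("generated", exact=True))  # (c), as assumed above
    )


# Example: an inexact manual model is overruled by an exact manual one.
inherited = Model("manual", exact=False)
direct = Model("manual", exact=True)
overruled = not is_relevant(inherited, [inherited, direct], has_source=True)
```

Under this reading, an exact manual neutral model is enough to suppress an inherited manual summary, which is the "overruling" that the union of exact and inexact models previously prevented.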

Points 2 and 3.c represent a change for e.g. Java, where we would previously union exact and inexact models, which meant that it was not possible to overrule inexact models. As a consequence, some inexact manual models have been replicated. DCA for Java reports some lost java/sensitive-log results on apache_solr, but all of those results have flow paths of length > 150, so they are almost certainly false positives, and their loss is most likely a consequence of 3.c.

In order for the logic to be defined in the shared flow summary library, I had to move provenance and exactness information into the propagatesFlow predicate, which is a breaking change.

Lastly, I have applied the ::Range pattern to the SummarizedCallable class for all languages except C++, which currently does not expose this class. This means that SummarizedCallable::Range will contain all flow summaries, whereas SummarizedCallable will only contain relevant summaries.

@hvitved hvitved force-pushed the shared/flow-summary-provenance-filtering branch 3 times, most recently from a3e585d to eb48820 Compare December 17, 2025 19:45
@github-actions github-actions bot added the JS label Dec 18, 2025
@hvitved hvitved force-pushed the shared/flow-summary-provenance-filtering branch from 1e946f8 to 30a0791 Compare December 18, 2025 10:06
@hvitved hvitved force-pushed the shared/flow-summary-provenance-filtering branch 3 times, most recently from 0fbea88 to 5a2881d Compare January 13, 2026 10:08
@hvitved hvitved force-pushed the shared/flow-summary-provenance-filtering branch from 5a2881d to a941f4a Compare January 13, 2026 10:59
@hvitved hvitved force-pushed the shared/flow-summary-provenance-filtering branch 2 times, most recently from bf632b3 to c6383ff Compare January 13, 2026 13:36
@hvitved hvitved force-pushed the shared/flow-summary-provenance-filtering branch 2 times, most recently from 9f81377 to 0057ae3 Compare January 13, 2026 14:43
@hvitved hvitved force-pushed the shared/flow-summary-provenance-filtering branch 2 times, most recently from 1933d1c to 72dfe9c Compare January 14, 2026 08:30
@hvitved hvitved force-pushed the shared/flow-summary-provenance-filtering branch 2 times, most recently from 2d63aaa to 4060c02 Compare January 15, 2026 08:37
@github-actions

⚠️ The head of this PR and the base branch were compared for differences in the framework coverage reports. The generated reports are available in the artifacts of this workflow run. The differences will be picked up by the nightly job after the PR gets merged.

Differences in coverage:

java

Generated file changes for java

  • Changes to framework-coverage-java.rst:
-    `Apache Commons Collections <https://commons.apache.org/proper/commons-collections/>`_,"``org.apache.commons.collections``, ``org.apache.commons.collections4``",,1600,,,,,,,
+    `Apache Commons Collections <https://commons.apache.org/proper/commons-collections/>`_,"``org.apache.commons.collections``, ``org.apache.commons.collections4``",,1615,,,,,,,
-    Java Standard Library,``java.*``,10,4628,260,99,,9,,,26
+    Java Standard Library,``java.*``,10,4629,260,99,,9,,,26
-    `Spring <https://spring.io/>`_,``org.springframework.*``,46,492,143,26,,28,14,,35
+    `Spring <https://spring.io/>`_,``org.springframework.*``,46,494,143,26,,28,14,,35
-    Totals,,363,26372,2681,404,16,134,33,1,409
+    Totals,,363,26390,2681,404,16,134,33,1,409
  • Changes to framework-coverage-java.csv:
- java.util,48,2,1339,,,,,,,,,1,,,,,,,,,,,34,,,,3,,,,5,2,,1,2,,,,,,,,,,,,,,2,,,558,781
+ java.util,48,2,1340,,,,,,,,,1,,,,,,,,,,,34,,,,3,,,,5,2,,1,2,,,,,,,,,,,,,,2,,,558,782
- org.apache.commons.collections4,,,800,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,17,783
+ org.apache.commons.collections4,,,815,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,17,798
- org.springframework.web.util,,9,157,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,9,132,25
+ org.springframework.web.util,,9,159,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,9,134,25

Missing manual models were added using the following code, which was placed in `FlowSummaryImpl.qll`:

```ql
    /**
     * Holds if the summary model row given by the string columns applies to `c`.
     * `isExact` records whether the row is defined directly for the callable, as
     * opposed to a callable that it implements or overrides.
     */
    private predicate testsummaryElement(
      Input::SummarizedCallableBase c, string namespace, string type, boolean subtypes, string name,
      string signature, string ext, string originalInput, string originalOutput, string kind,
      string provenance, string model, boolean isExact
    ) {
      exists(string input, string output, Callable baseCallable |
        summaryModel(namespace, type, subtypes, name, signature, ext, originalInput, originalOutput,
          kind, provenance, model) and
        baseCallable = interpretElement(namespace, type, subtypes, name, signature, ext, isExact) and
        (
          c.asCallable() = baseCallable and input = originalInput and output = originalOutput
          or
          correspondingKotlinParameterDefaultsArgSpec(baseCallable, c.asCallable(), originalInput,
            input) and
          correspondingKotlinParameterDefaultsArgSpec(baseCallable, c.asCallable(), originalOutput,
            output)
        )
      )
    }

    /**
     * Holds if some callable has an inexact model with the given input, output, and
     * kind, also has an exact model row (which supplies the name columns) with the
     * same provenance, but has no exact model with that input, output, and kind.
     */
    private predicate testsummaryElement2(
      string namespace, string type, boolean subtypes, string name, string signature, string ext,
      string originalInput, string originalOutput, string kind, string provenance, string model
    ) {
      exists(Input::SummarizedCallableBase c |
        testsummaryElement(c, _, _, _, _, _, _, originalInput, originalOutput, kind, provenance,
          model, false) and
        testsummaryElement(c, namespace, type, subtypes, name, signature, ext, _, _, _, provenance,
          _, true) and
        not testsummaryElement(c, _, _, _, _, _, _, originalInput, originalOutput, kind, provenance,
          _, true)
      )
    }

    /** Gets a missing model formatted as a YAML data extension row. */
    private string getAMissingManualModel() {
      exists(
        string namespace, string type, boolean subtypes, string name, string signature, string ext,
        string originalInput, string originalOutput, string kind, string provenance, string model
      |
        testsummaryElement2(namespace, type, subtypes, name, signature, ext, originalInput,
          originalOutput, kind, provenance, model) and
        // The `subtypes` and `ext` columns are emitted as the fixed values `True` and `""`.
        result =
          "- [\"" + namespace + "\", \"" + type + "\", True, \"" + name + "\", \"" + signature +
            "\", \"\", \"" + originalInput + "\", \"" + originalOutput + "\", \"" + kind + "\", \"" +
            provenance + "\"]"
      )
    }
```
@hvitved hvitved force-pushed the shared/flow-summary-provenance-filtering branch from 4060c02 to b6764b2 Compare January 15, 2026 14:26
@hvitved hvitved marked this pull request as ready for review January 16, 2026 08:49
@hvitved hvitved requested review from a team as code owners January 16, 2026 08:49
Copilot AI review requested due to automatic review settings January 16, 2026 08:49
@hvitved hvitved requested review from a team as code owners January 16, 2026 08:49
Copilot AI left a comment

Pull request overview

This PR aligns the logic across languages for how flow summaries are prioritized based on provenance (manual vs. generated) and exactness (direct match vs. inherited/overridden). The key changes implement a consistent filtering mechanism to determine which flow summaries are considered "relevant."

Changes:

  • Introduced a unified relevantSummary predicate in the shared flow summary library that filters flow summaries based on provenance and exactness
  • Modified SummarizedCallable.propagatesFlow to include Provenance p and boolean isExact parameters, replacing separate hasProvenance and hasExactModel predicates
  • Applied the ::Range pattern to the SummarizedCallable class for all languages except C++ (for example Swift, Rust, and Ruby)

Reviewed changes

Copilot reviewed 131 out of 138 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| `shared/dataflow/codeql/dataflow/internal/FlowSummaryImpl.qll` | Core implementation of unified flow summary filtering logic and RelevantSummarizedCallable class |
| `swift/ql/lib/codeql/swift/dataflow/FlowSummary.qll` | Applied Range pattern and updated to use RelevantSummarizedCallable |
| `swift/ql/lib/codeql/swift/dataflow/internal/FlowSummaryImpl.qll` | Added callableFromSource predicate |
| `swift/ql/lib/codeql/swift/dataflow/ExternalFlow.qll` | Updated adapter to provide provenance and exactness information |
| `rust/ql/lib/codeql/rust/dataflow/FlowSummary.qll` | Applied Range pattern and updated to use RelevantSummarizedCallable |
| `rust/ql/lib/codeql/rust/dataflow/internal/FlowSummaryImpl.qll` | Added callableFromSource predicate |
| `rust/ql/lib/codeql/rust/dataflow/internal/ModelsAsData.qll` | Refactored to populate provenance/exactness in propagatesFlow instead of filtering afterward |
| `rust/ql/lib/codeql/rust/frameworks/stdlib/*.model.yml` | Added manual overrides for generated models to ensure proper precedence |
| `ruby/ql/lib/codeql/ruby/dataflow/FlowSummary.qll` | Applied Range pattern with backward compatibility wrapper |
| `ruby/ql/lib/codeql/ruby/frameworks/**/*.qll` | Updated all flow summary classes to extend SummarizedCallable::Range |
| `ruby/ql/test/library-tests/dataflow/summaries/Summaries.ql` | Updated test classes to use Range pattern |
| `rust/ql/test/query-tests/security/**/*.expected` | Updated test expectations reflecting removal of duplicate models |
| `swift/ql/lib/change-notes/2026-01-16-summarized-callable.md` | Change note documenting breaking API change |
| `rust/ql/lib/change-notes/2026-01-16-summarized-callable.md` | Change note documenting breaking API change |
| `ruby/ql/lib/change-notes/2026-01-16-summarized-callable.md` | Change note documenting breaking API change |
| `rust/ql/test/library-tests/dataflow/models/models.ql` | Updated to filter manual models in invalidSpecComponent query |
| `python/ql/lib/semmle/python/frameworks/Flask.qll` | Updated to use Range pattern |
| `javascript/ql/lib/utils/test/InlineSummaries.qll` | Updated to use Range pattern |


---
category: minorAnalysis
---
* The predicate `SummarizedCallable.propagatesFlow` has been added the columns `Provenance p` and `boolean isExact`, and as a consequence the predicates `SummarizedCallable.hasProvenance` and `SummarizedCallable.hasExactModel` have been removed.

Copilot AI Jan 16, 2026

Grammatical error: "has been added the columns" should be "has had the columns added" or "has been given the columns".

)
or
exists(
SummarizedCallableImpl callable, DataFlow::CallNode call, SummaryComponentStack input,

SummarizedCallableImpl shouldn't be used here I guess, as the QLDoc says not to. I'm not totally sure what we should use instead. I guess just SummarizedCallable, as we should only consider relevant summarized callables.
