Skip to content

Go: fix bad join order#21576

Open
owen-mc wants to merge 2 commits intogithub:mainfrom
owen-mc:go/fix-bad-join-order
Open

Go: fix bad join order#21576
owen-mc wants to merge 2 commits intogithub:mainfrom
owen-mc:go/fix-bad-join-order

Conversation

@owen-mc
Copy link
Contributor

@owen-mc owen-mc commented Mar 25, 2026

Previously the optimizer was, in some cases, making a table of the file corresponding to all locations and only then limiting to locations corresponding to diagnostics. This pragma forces it to start from a list of diagnostics, get the locations and then the files.

Previously the optimizer was, in some cases, making a table of the file
corresponding to all locations and only then limiting to locations
corresponding to diagnostics. This pragma forces it to start from a list
of diagnostics, get the locations and then the files.
@owen-mc owen-mc requested review from a team and Copilot March 25, 2026 14:17
@owen-mc owen-mc requested a review from a team as a code owner March 25, 2026 14:17
@owen-mc owen-mc added the no-change-note-required This PR does not need a change note label Mar 25, 2026
@github-actions github-actions bot added the Go label Mar 25, 2026
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Updates Go extractor diagnostic reporting to steer the QL optimizer toward a more efficient join order when deriving the associated file for a diagnostic, avoiding large intermediate tables.

Changes:

  • Add pragma[only_bind_into](this) when computing Diagnostic.getFile() to drive evaluation from diagnostics → locations → files.

Co-authored-by: Anders Schack-Mulligen <aschackmull@users.noreply.github.com>
@owen-mc owen-mc requested a review from aschackmull March 26, 2026 09:50
@owen-mc
Copy link
Contributor Author

owen-mc commented Mar 26, 2026

I think you're right. We want to enforce that the location is bound before we get the corresponding file, which is what your suggestiong does. I always find these pragmas confusing. For the record the original RA was:

Pipeline standard for DiagnosticsReporting::Diagnostic.getFile/0#dispred#c0981f80@f1338zup was evaluated in 2 iterations totaling 23ms (delta sizes total: 7489).
        3361166  ~2%    {2} r1 = JOIN Files::File#0e2ffa35#prev_delta WITH locations_default_10#join_rhs ON FIRST 1 OUTPUT Rhs.1, Lhs.0
           7489  ~2%    {2}    | JOIN WITH diagnostics_50#join_rhs ON FIRST 1 OUTPUT Rhs.1, Lhs.1
           7489  ~2%    {2}    | AND NOT `DiagnosticsReporting::Diagnostic.getFile/0#dispred#c0981f80#prev`(FIRST 2)
                        return r1

With your suggestion the RA is:

[2026-03-26 09:44:39] Evaluated non-recursive predicate _diagnostics_locations_default#join_rhs@2ef3455n in 2ms (size: 7490).
Evaluated relational algebra for predicate _diagnostics_locations_default#join_rhs@2ef3455n with tuple counts:
        7490  ~4%    {2} r1 = SCAN diagnostics OUTPUT In.5, In.0
        7490  ~0%    {2}    | JOIN WITH locations_default ON FIRST 1 OUTPUT Rhs.1, Lhs.1
                     return r1

...

Pipeline standard for DiagnosticsReporting::Diagnostic.getFile/0#dispred#c0981f80@a3b82z1h was evaluated in 2 iterations totaling 0ms (delta sizes total: 7489).
        7489  ~2%    {2} r1 = JOIN Files::File#0e2ffa35#prev_delta WITH _diagnostics_locations_default#join_rhs ON FIRST 1 OUTPUT Rhs.1, Lhs.0
        7489  ~2%    {2}    | AND NOT `DiagnosticsReporting::Diagnostic.getFile/0#dispred#c0981f80#prev`(FIRST 2)
                     return r1

@aschackmull
Copy link
Contributor

Woah! It's recursive?!

@owen-mc
Copy link
Contributor Author

owen-mc commented Mar 26, 2026

I don't understand that either. Here are the relevant parts of the RA:

Evaluated recursive predicate DiagnosticsReporting::Diagnostic.getFile/0#dispred#c0981f80@a3b82z1h in 0ms on iteration 1 (delta size: 0).
Evaluated relational algebra for predicate DiagnosticsReporting::Diagnostic.getFile/0#dispred#c0981f80@a3b82z1h on iteration 1 running pipeline base with tuple counts:
        0  ~0%    {2} r1 = EMPTY(unique entity, unique entity)
                  return r1


Evaluated recursive predicate DiagnosticsReporting::Diagnostic.getFile/0#dispred#c0981f80@a3b82z1h in 0ms on iteration 2 (delta size: 0).
Evaluated relational algebra for predicate DiagnosticsReporting::Diagnostic.getFile/0#dispred#c0981f80@a3b82z1h on iteration 2 running pipeline standard with tuple counts:
        0  ~0%    {2} r1 = JOIN Files::File#0e2ffa35#prev_delta WITH _diagnostics_locations_default#join_rhs ON FIRST 1 OUTPUT Rhs.1, Lhs.0
        0  ~0%    {2}    | AND NOT `DiagnosticsReporting::Diagnostic.getFile/0#dispred#c0981f80#prev`(FIRST 2)
                  return r1


Evaluated recursive predicate DiagnosticsReporting::Diagnostic.getFile/0#dispred#c0981f80@a3b82z1h in 0ms on iteration 4 (delta size: 7489).
Evaluated relational algebra for predicate DiagnosticsReporting::Diagnostic.getFile/0#dispred#c0981f80@a3b82z1h on iteration 4 running pipeline standard with tuple counts:
        7489  ~2%    {2} r1 = JOIN Files::File#0e2ffa35#prev_delta WITH _diagnostics_locations_default#join_rhs ON FIRST 1 OUTPUT Rhs.1, Lhs.0
        7489  ~2%    {2}    | AND NOT `DiagnosticsReporting::Diagnostic.getFile/0#dispred#c0981f80#prev`(FIRST 2)
                     return r1

If I'm reading that correctly it's recursive with Files::File.

@aschackmull
Copy link
Contributor

But the presumably "bad" join-order doesn't actually look that bad to me - was it taking non-trivial time to compute somewhere? Because otherwise I don't think there's anything to fix here.

@owen-mc
Copy link
Contributor Author

owen-mc commented Mar 26, 2026

No, it wasn't taking up any time. It was just showing up in the nightly DCA runs and seemed like an easy fix.

@aschackmull
Copy link
Contributor

Ah! It's being turned into a recursive predicate through magic that's inserted on AstNode.getAChild! So weird as it seems, the proper fix may just be a nomagic annotation on AstNode.getAChild.

@owen-mc
Copy link
Contributor Author

owen-mc commented Mar 26, 2026

I just tried it out and just adding that pragma doesn't get rid of the problem.

@aschackmull
Copy link
Contributor

Which predicate did you target? The following does the trick for me:

diff --git a/go/ql/lib/semmle/go/AST.qll b/go/ql/lib/semmle/go/AST.qll
index a0715f6ca0e..064b62d422c 100644
--- a/go/ql/lib/semmle/go/AST.qll
+++ b/go/ql/lib/semmle/go/AST.qll
@@ -30,6 +30,7 @@ class AstNode extends @node, Locatable {
   /**
    * Gets a child node of this node.
    */
+  pragma[nomagic]
   AstNode getAChild() { result = this.getChild(_) }
 
   /**

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Go no-change-note-required This PR does not need a change note

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants