@everaldorodrigo everaldorodrigo commented Dec 12, 2025

Summary

This PR refines the pinnedness algorithm that selects the starting (root) node for multihop TRAPI→Dgraph queries. The fix ensures that nodes with more selective constraints, especially a single ID, are preferred as the root.
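
As a rough illustration of the preference described above, here is a minimal Python sketch. It is not the transpiler's actual implementation: `QNode`, `selectivity`, and `pick_root` are hypothetical names, and the weights are arbitrary assumptions chosen only so that a single ID outranks any category-only constraint.

```python
from dataclasses import dataclass, field


@dataclass
class QNode:
    """Hypothetical stand-in for a TRAPI query-graph node."""
    key: str
    ids: list[str] = field(default_factory=list)
    categories: list[str] = field(default_factory=list)


def selectivity(node: QNode) -> float:
    """Return a score where higher means more constrained (assumed weights)."""
    if node.ids:
        # A single ID is the tightest constraint; more IDs dilute it.
        return 1000.0 / len(node.ids)
    if node.categories:
        # Category-only nodes match many records, so they score far lower.
        return 1.0 / len(node.categories)
    return 0.0


def pick_root(nodes: list[QNode]) -> str:
    """Pick the most selective node; ties are broken by key for determinism."""
    return max(nodes, key=lambda n: (selectivity(n), n.key)).key


nodes = [
    QNode("n0", ids=["MONDO:0011705"]),
    QNode("n1", categories=["ChemicalEntity"]),
    QNode("n2", categories=["Disease"]),
]
root = pick_root(nodes)
```

With the three nodes from the example queries in this PR, the sketch selects `n0`, the node pinned to a single ID.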

Motivation

The previous pinnedness calculation sometimes chose a less-constrained node as the root when adjacency was dense.
Tests revealed cases where a node with an ID should have been the root but was not.

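The slow Dgraph query generated before the fix is below. It roots the traversal at n2, matching every node with category Disease, instead of at n0, which is pinned to the single ID MONDO:0011705: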

{
    q0_node_n2(func: eq(vF_category, "Disease")) @cascade(vF_id, ~vF_subject, ~vF_object) {
        expand(vF_Node)

        # Path 1: Disease (Subject) -> Phenotype (MONDO:0011705)
        out_edges_e1: ~vF_subject @filter(eq(vF_predicate_ancestors, "has_phenotype")) @cascade(vF_predicate, vF_object) {
            expand(vF_Edge) { vF_sources expand(vF_Source) }
            node_n0: vF_object @filter(eq(vF_id, "MONDO:0011705")) @cascade(vF_id) {
                expand(vF_Node)
            }
        }

        # Path 2: Chemical (Subject) -> Treatment (Edge) -> Disease (Object)
        in_edges_e0: ~vF_object @filter(eq(vF_predicate_ancestors, "treats_or_applied_or_studied_to_treat")) @cascade(vF_predicate, vF_subject) {
            expand(vF_Edge) { vF_sources expand(vF_Source) }
            node_n1: vF_subject @filter(eq(vF_category, "ChemicalEntity")) @cascade(vF_id) {
                expand(vF_Node)
            }
        }

        # Path 3: MONDO:0011705 (Subject) -> has_phenotype (Edge) -> Disease (Object)
        in_edges_e2: ~vF_object @filter(eq(vF_predicate_ancestors, "has_phenotype")) @cascade(vF_predicate, vF_subject) {
            expand(vF_Edge) { vF_sources expand(vF_Source) }
            node_n0: vF_subject @filter(eq(vF_id, "MONDO:0011705")) @cascade(vF_id) {
                expand(vF_Node)
            }
        }
    }
}

The Dgraph query generated after the fix is below. It is rooted at n0, the node pinned to the single ID MONDO:0011705:

{
    q0_node_n0(func: eq(vF_id, "MONDO:0011705")) @cascade(vF_id, ~vF_subject, ~vF_object) {
        expand(vF_Node)

        # Path 1: MONDO:0011705 is the SUBJECT of 'has_phenotype' (i.e., MONDO:0011705 has Disease as a phenotype)
        out_edges_e2: ~vF_subject @filter(eq(vF_predicate_ancestors, "has_phenotype")) @cascade(vF_predicate, vF_object) {
            expand(vF_Edge) { vF_sources expand(vF_Source) }
            node_n2: vF_object @filter(eq(vF_category, "Disease")) @cascade(vF_id, ~vF_object) {
                expand(vF_Node)
                # Find Chemical Treatments for this Disease
                in_edges_e0: ~vF_object @filter(eq(vF_predicate_ancestors, "treats_or_applied_or_studied_to_treat")) @cascade(vF_predicate, vF_subject) {
                    expand(vF_Edge) { vF_sources expand(vF_Source) }
                    node_n1: vF_subject @filter(eq(vF_category, "ChemicalEntity")) @cascade(vF_id) {
                        expand(vF_Node)
                    }
                }
            }
        }

        # Path 2: MONDO:0011705 is the OBJECT of 'has_phenotype' (i.e., Disease has MONDO:0011705 as a phenotype)
        in_edges_e1: ~vF_object @filter(eq(vF_predicate_ancestors, "has_phenotype")) @cascade(vF_predicate, vF_subject) {
            expand(vF_Edge) { vF_sources expand(vF_Source) }
            node_n2: vF_subject @filter(eq(vF_category, "Disease")) @cascade(vF_id, ~vF_object) {
                expand(vF_Node)
                # Find Chemical Treatments for this Disease
                in_edges_e0: ~vF_object @filter(eq(vF_predicate_ancestors, "treats_or_applied_or_studied_to_treat")) @cascade(vF_predicate, vF_subject) {
                    expand(vF_Edge) { vF_sources expand(vF_Source) }
                    node_n1: vF_subject @filter(eq(vF_category, "ChemicalEntity")) @cascade(vF_id) {
                        expand(vF_Node)
                    }
                }
            }
        }
    }
}
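
The nesting of both queries above can be reproduced by walking the same three-edge query graph from a different root. The Python sketch below is illustrative only and is not taken from the transpiler: `QEDGES` and `build_tree` are hypothetical names, the edge directions are read off the `out_edges_`/`in_edges_` aliases in the queries, and the rule of skipping neighbours already on the current path is an assumption that happens to match the nesting shown.

```python
# Query edges as (subject, object), inferred from the aliases in the queries.
QEDGES = {
    "e0": ("n1", "n2"),  # ChemicalEntity -> treats... -> Disease
    "e1": ("n2", "n0"),  # Disease -> has_phenotype -> MONDO:0011705
    "e2": ("n0", "n2"),  # MONDO:0011705 -> has_phenotype -> Disease
}


def build_tree(node, edges, path=()):
    """Return nested (alias, neighbour, subtree) tuples for a walk from `node`.

    Neighbours already on the current path are skipped so the walk terminates.
    """
    path = path + (node,)
    children = []
    for key in sorted(edges):
        subj, obj = edges[key]
        if subj == node and obj not in path:
            # The current node is the edge's subject: an outgoing edge.
            children.append((f"out_edges_{key}", obj, build_tree(obj, edges, path)))
        elif obj == node and subj not in path:
            # The current node is the edge's object: an incoming edge.
            children.append((f"in_edges_{key}", subj, build_tree(subj, edges, path)))
    return children


slow_shape = build_tree("n2", QEDGES)   # rooted at the Disease category
fixed_shape = build_tree("n0", QEDGES)  # rooted at the single ID
```

Rooted at `n2`, the walk yields three flat branches, as in the slow query. Rooted at `n0`, the `n2` to `n1` subtree appears under both `has_phenotype` edges, as in the fixed query.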

sentry bot commented Dec 12, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.

Files with missing lines Coverage Δ
...c/retriever/data_tiers/tier_0/dgraph/transpiler.py 79.62% <100.00%> (+0.80%) ⬆️

@everaldorodrigo everaldorodrigo linked an issue Dec 12, 2025 that may be closed by this pull request
@everaldorodrigo everaldorodrigo changed the title from "fix(tier0): Fix Pinnedness Algorithm." to "fix(tier0): Fix Pinnedness Algorithm" Dec 12, 2025
@tokebe tokebe marked this pull request as ready for review December 17, 2025 17:05
@tokebe tokebe merged commit a8a107c into main Dec 17, 2025
5 checks passed
@everaldorodrigo everaldorodrigo deleted the tier0-fix-pinnedness-algorithm branch December 17, 2025 18:00

Linked issue: Certain tier 0 queries time out/socket closes
