Optimize Gap Analysis by Pruning Gap Analysis Traversal Using Tiered Neo4j Queries to Improve Performance #716

PRAteek-singHWY · 2026-01-15T07:47:48Z

🚀 Prune Gap Analysis Search to Save Time and Memory

This PR implements a tiered pruning strategy for Gap Analysis to significantly reduce execution time and memory usage during map analysis.
The change directly addresses Issue #506 and aligns with the original design discussion around stopping early when strong or medium links are found.

🧠 Problem

Gap analysis currently performs an expensive wildcard Neo4j traversal:

MATCH p = allShortestPaths((BaseStandard)-[*..20]-(CompareStandard))

This approach:

Traverses all relationship types
Generates a large number of weakly relevant paths
Consumes large amounts of memory
Can take days to complete on large datasets
Runs even when direct or strong links already exist

In practice, we are only interested in the strongest connections between standards.

✅ Solution: Tiered Pruning Strategy

The search is now executed in three tiers, with early exit once results are found.

Tier 1 – Strong Links

Executed first. If any paths are found, the search stops immediately.

Relationships included:

LINKED_TO
AUTOMATICALLY_LINKED_TO
SAME

These correspond to the strongest connections (penalty = 0) and include equivalence (SAME) relationships.

Tier 2 – Medium Links

Executed only if Tier 1 returns no results.

Relationships included:

LINKED_TO
AUTOMATICALLY_LINKED_TO
SAME
CONTAINS

This captures hierarchical relationships without falling back to a full wildcard traversal.
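As a rough illustration of how the Tier 1 and Tier 2 relationship sets could be turned into Cypher patterns, here is a minimal sketch. The relationship names and the depth limit of 20 come from this description; the helper names (`relationship_pattern`, `shortest_paths_query`), the `RETURN p` clause, and the simplified node placeholders are assumptions for illustration and are not taken from the actual code in this PR.

```python
# Relationship names are taken from the PR description.
# Constant and function names below are hypothetical.
STRONG = ("LINKED_TO", "AUTOMATICALLY_LINKED_TO", "SAME")
MEDIUM = STRONG + ("CONTAINS",)
MAX_DEPTH = 20


def relationship_pattern(rel_types, max_depth=MAX_DEPTH):
    """Build a variable-length Cypher relationship pattern.

    An empty sequence of types produces the wildcard pattern.
    """
    if not rel_types:
        return f"[*..{max_depth}]"
    return f"[:{'|'.join(rel_types)}*..{max_depth}]"


def shortest_paths_query(rel_types):
    """Wrap the pattern in the simplified query shape shown in the Problem section."""
    return (
        "MATCH p = allShortestPaths("
        f"(BaseStandard)-{relationship_pattern(rel_types)}-(CompareStandard)) "
        "RETURN p"  # assumed; the description shows only the MATCH clause
    )


print(relationship_pattern(STRONG))
print(relationship_pattern(MEDIUM))
print(relationship_pattern(()))
```

Restricting the pattern to named relationship types is what keeps Neo4j from expanding every edge type during the shortest-path search.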

Tier 3 – Fallback (Wildcard)

Executed only if Tier 1 and Tier 2 return no paths.

[*..20]

This preserves existing behavior as a fallback to ensure no loss of coverage.
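The three tiers and the early exit can be sketched roughly as follows. This is not the implementation in this PR: `tiered_gap_analysis` and the injected `run_query` callable are hypothetical names, and the query text is the simplified shape from the Problem section with an assumed `RETURN p`. Injecting the query runner keeps the sketch independent of any particular Neo4j client.

```python
# Tier order and relationship sets follow the PR description.
TIERS = [
    ("strong", "[:LINKED_TO|AUTOMATICALLY_LINKED_TO|SAME*..20]"),
    ("medium", "[:LINKED_TO|AUTOMATICALLY_LINKED_TO|SAME|CONTAINS*..20]"),
    ("wildcard", "[*..20]"),
]


def tiered_gap_analysis(run_query):
    """Run each tier in order and stop at the first one that yields paths.

    run_query is a hypothetical callable that takes a Cypher string and
    returns a list of paths (empty when nothing matches).
    """
    for tier_name, pattern in TIERS:
        query = (
            "MATCH p = allShortestPaths("
            f"(BaseStandard)-{pattern}-(CompareStandard)) RETURN p"
        )
        paths = run_query(query)
        if paths:
            # Early exit: later (more expensive) tiers are never executed.
            return tier_name, paths
    return "none", []
```

With this shape, the wildcard traversal is only paid for when neither the strong nor the medium tier finds any path, which matches the fallback behavior described above.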

🧪 Testing

A new unit test has been added to verify pruning behavior:

Confirms that Tier 3 is not executed when Tier 1 returns results
Uses mocking to detect which Neo4j queries are executed
Protects against future regressions in pruning logic

Test command:

python3 -m unittest application/tests/gap_analysis_db_test.py

All existing gap analysis tests continue to pass.
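For readers who want a feel for the pruning check, the sketch below shows one way such a test could be written with `unittest.mock`. It is a self-contained approximation, not the test added in `application/tests/gap_analysis_db_test.py`: `tiered_search`, `PATTERNS`, and the test class name are hypothetical stand-ins for the real code.

```python
import unittest
from unittest.mock import Mock

WILDCARD = "[*..20]"
PATTERNS = [
    "[:LINKED_TO|AUTOMATICALLY_LINKED_TO|SAME*..20]",           # Tier 1
    "[:LINKED_TO|AUTOMATICALLY_LINKED_TO|SAME|CONTAINS*..20]",  # Tier 2
    WILDCARD,                                                    # Tier 3
]


def tiered_search(run_query):
    """Hypothetical stand-in for the tiered search: first non-empty tier wins."""
    for pattern in PATTERNS:
        paths = run_query(pattern)
        if paths:
            return paths
    return []


class TieredPruningTest(unittest.TestCase):
    def test_wildcard_skipped_when_strong_paths_exist(self):
        # The mock returns a path for every call, so Tier 1 succeeds.
        run_query = Mock(return_value=["strong-path"])

        result = tiered_search(run_query)

        self.assertEqual(result, ["strong-path"])
        # Only the Tier 1 pattern should have been executed.
        run_query.assert_called_once_with(PATTERNS[0])
        executed = [call.args[0] for call in run_query.call_args_list]
        self.assertNotIn(WILDCARD, executed)
```

Saved as a module, this runs with the same `python3 -m unittest` invocation style as the command above.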

📈 Impact

🚀 Major reduction in gap analysis runtime
🧠 Lower memory usage
🛑 Avoids computing unnecessary weak paths
🧩 Fully backward compatible
🎯 Directly addresses the performance concerns raised in Issue #506 (Prune map analysis search to save time and memory)

🔗 Related Issue

Prune map analysis search to save time and memory
Fixes #506

📝 Notes

Path scoring logic is unchanged
Relationship semantics are preserved
This PR focuses strictly on backend query pruning
Frontend categorization changes are intentionally deferred to a follow-up PR (Stage 2)

- Introduce tiered gap analysis queries (strong → medium → wildcard)
- Stop traversal early when strong or medium paths exist
- Preserve existing scoring and semantics
- Add unit test to verify Tier-3 traversal is skipped when not needed

Fixes OWASP#506

PRAteek-singHWY mentioned this pull request Jan 15, 2026

feat(frontend): Refine Gap Analysis link strength categorization (Weak threshold 20 -> 7) #717

