⚡️ Speed up function find_cycle_vertices by 215% #26
📄 215% (2.15x) speedup for find_cycle_vertices in src/dsa/nodes.py
⏱️ Runtime: 80.6 milliseconds → 25.6 milliseconds (best of 80 runs)
📝 Explanation and details
Let's break down the profiling results and focus on performance improvements.
Profiling Analysis
From your profiler: graph = nx.DiGraph(edges) takes 79.1% of the time, and cycles = list(nx.simple_cycles(graph)) takes 20.7% of the time.
So, graph construction from the edge list is the main bottleneck, followed by finding all cycles.
Step 1: Speed Up nx.DiGraph Construction
NetworkX can be slow for large graphs or when constructing from a dense edge list. However, if your edge list is already an efficient representation (tuples like (u, v)), there’s little to optimize with NetworkX itself.
Suggestion: make sure edges is a list or tuple (not a generator or slower structure).
Alternative: Native Algorithms
If you only need cycle detection and the nodes, you could avoid NetworkX altogether for further speedup—replacing it with a native DFS (Tarjan's algorithm) or similar. But if NetworkX and its API must be retained, see below.
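As a rough illustration of what such a native approach could look like, here is a minimal sketch of an iterative Tarjan-style SCC pass that needs no NetworkX. It is not code from this PR: the function name find_cycle_vertices_native, the set return type, and the assumption that edges is an iterable of hashable (u, v) pairs are all hypothetical.

```python
from collections import defaultdict


def find_cycle_vertices_native(edges):
    """Return the set of vertices lying on at least one directed cycle.

    Hypothetical helper (not from the PR). Assumes `edges` is an iterable
    of hashable (u, v) pairs.
    """
    adjacency = defaultdict(list)
    vertices = set()
    result = set()
    for u, v in edges:
        adjacency[u].append(v)
        vertices.update((u, v))
        if u == v:
            result.add(u)  # a self-loop is a cycle of length one

    index = {}       # discovery order of each vertex
    lowlink = {}     # smallest discovery index reachable from the vertex
    on_stack = set()
    scc_stack = []
    counter = 0

    for root in vertices:
        if root in index:
            continue
        index[root] = lowlink[root] = counter
        counter += 1
        scc_stack.append(root)
        on_stack.add(root)
        # Explicit work stack instead of recursion, so deep graphs
        # do not hit Python's recursion limit.
        work = [(root, iter(adjacency[root]))]
        while work:
            node, neighbours = work[-1]
            advanced = False
            for nxt in neighbours:
                if nxt not in index:
                    index[nxt] = lowlink[nxt] = counter
                    counter += 1
                    scc_stack.append(nxt)
                    on_stack.add(nxt)
                    work.append((nxt, iter(adjacency[nxt])))
                    advanced = True
                    break
                if nxt in on_stack:
                    lowlink[node] = min(lowlink[node], index[nxt])
            if advanced:
                continue
            work.pop()
            if work:
                parent = work[-1][0]
                lowlink[parent] = min(lowlink[parent], lowlink[node])
            if lowlink[node] == index[node]:
                # `node` is the root of a strongly connected component.
                component = []
                while True:
                    member = scc_stack.pop()
                    on_stack.discard(member)
                    component.append(member)
                    if member == node:
                        break
                if len(component) > 1:
                    result.update(component)
    return result
```

The sketch runs in time linear in the number of vertices and edges. Whether it actually beats the NetworkX version on a given workload would need to be measured.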
Step 2: Optimize Cycle Discovery
nx.simple_cycles(graph) is already the most efficient cycle enumerator within NetworkX (it implements Johnson's algorithm); alternatives will likely add more code, but can be faster for certain graph densities.
Step 3: Optimize Cycle Vertex Extraction
Instead of flattening all cycles, collect nodes in SCCs of size > 1 (and nodes with self-loops). This is faster than enumerating all cycles.
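The SCC idea described above can be sketched roughly as follows. This is an illustrative version written for this description, not necessarily the exact code in the PR: the function name find_cycle_vertices_scc and the set return type are assumptions, while nx.DiGraph, nx.strongly_connected_components and has_edge are standard NetworkX calls.

```python
import networkx as nx


def find_cycle_vertices_scc(edges):
    """Collect vertices on any directed cycle via SCCs.

    Hypothetical sketch: the name and the set return type are
    assumptions, not taken from src/dsa/nodes.py.
    """
    graph = nx.DiGraph(edges)
    cycle_vertices = set()
    for component in nx.strongly_connected_components(graph):
        if len(component) > 1:
            # Every node in a multi-node SCC lies on some cycle.
            cycle_vertices.update(component)
        else:
            # A single-node SCC is cyclic only if it has a self-loop.
            (node,) = component
            if graph.has_edge(node, node):
                cycle_vertices.add(node)
    return cycle_vertices
```

A caller that needs a sorted list rather than a set could wrap the result in sorted(...), assuming the vertices are mutually comparable.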
Fastest Solution
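Let’s use strongly connected components. For each SCC: if it contains more than one node, every node in it lies on a cycle; if it contains a single node, that node lies on a cycle only if it has a self-loop.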
Why is this faster?
Compatibility
Summary of Optimizations
nx.simple_cycles → much faster SCC analysis.
If you are allowed to avoid NetworkX entirely, let me know for a native, even faster solution! This version, however, will give you a major speedup for graphs with cycles.
Full rewritten code.
This should deliver dramatic speed improvement over the original, especially for larger graphs!
✅ Correctness verification report:
⚙️ Existing Unit Tests and Runtime
test_dsa_nodes.py::test_complex_graph
test_dsa_nodes.py::test_cycle_with_extra_nodes_edges
test_dsa_nodes.py::test_figure_eight
test_dsa_nodes.py::test_multiple_disjoint_cycles
test_dsa_nodes.py::test_multiple_overlapping_cycles
test_dsa_nodes.py::test_no_cycles_dag
test_dsa_nodes.py::test_self_loop
test_dsa_nodes.py::test_simple_triangle_cycle
test_dsa_nodes.py::test_simple_two_node_cycle
test_dsa_nodes.py::test_string_vertices
🌀 Generated Regression Tests and Runtime
To edit these changes, git checkout codeflash/optimize-find_cycle_vertices-mc8pivnf and push.