
⚡️ Speed up function find_cycle_vertices by 214% #83


Open: wants to merge 1 commit into base: async

codeflash-ai[bot]

@codeflash-ai codeflash-ai bot commented Aug 20, 2025

📄 214% (2.14x) speedup for find_cycle_vertices in src/dsa/nodes.py

⏱️ Runtime : 48.9 milliseconds → 15.6 milliseconds (best of 170 runs)
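The percentages in this report appear to follow the formula (original - optimized) / optimized. The sketch below is an assumption inferred from the reported numbers, not taken from Codeflash's documentation, and `speedup_percent` is a hypothetical helper name.

```python
def speedup_percent(original: float, optimized: float) -> float:
    """Relative change in runtime, as a percentage of the optimized runtime.

    Positive means the optimized code is faster; negative means it is slower.
    Assumed formula, inferred from the figures in this report.
    """
    return (original - optimized) / optimized * 100


# Both runtimes must use the same unit (ms or μs).
print(round(speedup_percent(151, 35.9)))      # test_complex_graph row
print(round(speedup_percent(640, 976), 1))    # 500 self-loops case
print(round(speedup_percent(48.9, 15.6), 1))  # headline runtimes
```

With the rounded runtimes shown here, the headline works out to roughly 213%, which is consistent with the reported 214% if the underlying timings were not rounded. Under this reading, the optimized code runs in about a third of the original time (48.9 / 15.6 ≈ 3.1), so "2.14x" would describe the relative gain and not the runtime ratio.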

📝 Explanation and details

The optimized code replaces the expensive nx.simple_cycles() call with nx.strongly_connected_components(), delivering a 214% speedup by switching to a cheaper algorithm instead of tuning the existing one.

Key Optimization:

  • Original: Enumerates every simple cycle explicitly with nx.simple_cycles(). The cost grows with the number of cycles, which can be exponential in the size of the graph.
  • Optimized: Uses strongly connected components (SCCs) to identify cycle vertices. nx.strongly_connected_components() is based on Tarjan's algorithm and runs in O(V+E) time.
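To give a feel for why explicit enumeration can be costly, the sketch below counts the simple cycles of a complete directed graph (no self-loops) on n vertices using the closed form sum over k of C(n, k) * (k - 1)!. This is a worst-case illustration, not a measurement from this PR, and the function name is hypothetical.

```python
from math import comb, factorial


def simple_cycle_count_complete_digraph(n: int) -> int:
    """Number of simple directed cycles (length >= 2) in a complete digraph.

    Choose the k vertices on the cycle, then one of the (k - 1)! cyclic orders.
    """
    return sum(comb(n, k) * factorial(k - 1) for k in range(2, n + 1))


for n in (3, 4, 5, 8, 12):
    size = n + n * (n - 1)  # V + E, the quantity an SCC pass is linear in
    cycles = simple_cycle_count_complete_digraph(n)
    print(f"n={n:2d}  V+E={size:3d}  simple cycles={cycles}")
```

The V+E column grows quadratically in n while the cycle count grows factorially, which is the gap the SCC-based approach avoids by never listing individual cycles.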

Why This Works:
A vertex participates in a cycle if and only if:

  1. It's in an SCC with multiple vertices (multi-vertex cycles), OR
  2. It's in a single-vertex SCC with a self-loop
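The two conditions can be sketched in plain Python. This is an illustration only: the PR itself calls nx.strongly_connected_components(), whereas the sketch uses a hand-written Kosaraju two-pass search as a dependency-free stand-in. The names `strongly_connected_components` and `find_cycle_vertices_scc`, and the sorted-list return value, are assumptions.

```python
from collections import defaultdict


def strongly_connected_components(edges):
    """Return the SCCs of a directed edge list as a list of sets (Kosaraju)."""
    succ, pred = defaultdict(list), defaultdict(list)
    nodes = {}  # dict used as an insertion-ordered set of vertices
    for u, v in edges:
        succ[u].append(v)
        pred[v].append(u)
        nodes[u] = nodes[v] = None

    # Pass 1: iterative DFS on the graph, recording vertices in finish order.
    order, visited = [], set()
    for root in nodes:
        if root in visited:
            continue
        visited.add(root)
        stack = [(root, iter(succ[root]))]
        while stack:
            node, successors = stack[-1]
            for nxt in successors:
                if nxt not in visited:
                    visited.add(nxt)
                    stack.append((nxt, iter(succ[nxt])))
                    break
            else:  # every successor explored: node is finished
                order.append(node)
                stack.pop()

    # Pass 2: search the reversed graph in reverse finish order.
    assigned, components = set(), []
    for root in reversed(order):
        if root in assigned:
            continue
        assigned.add(root)
        component, stack = {root}, [root]
        while stack:
            node = stack.pop()
            for prev in pred[node]:
                if prev not in assigned:
                    assigned.add(prev)
                    component.add(prev)
                    stack.append(prev)
        components.append(component)
    return components


def find_cycle_vertices_scc(edges):
    """Vertices that lie on at least one directed cycle, sorted."""
    edge_set = set(edges)
    on_cycle = set()
    for component in strongly_connected_components(edges):
        if len(component) > 1:
            on_cycle |= component      # condition 1: multi-vertex SCC
        else:
            (v,) = component
            if (v, v) in edge_set:     # condition 2: self-loop
                on_cycle.add(v)
    return sorted(on_cycle)


print(find_cycle_vertices_scc([(1, 2), (2, 3), (3, 1), (3, 4)]))  # [1, 2, 3]
print(find_cycle_vertices_scc([(1, 1), (2, 2), (3, 4)]))          # [1, 2]
```

Vertex 4 in the first call and vertices 3 and 4 in the second form single-vertex SCCs without a self-loop, so neither condition holds for them.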

Performance Analysis:
From the line profiler, the original spends 89.4% of time in nx.simple_cycles(), while the optimized version distributes work across SCC analysis (65.5%) and component processing. The SCC approach scales much better - it processes components once rather than enumerating all possible cycle paths.

Test Case Performance:

  • Best gains on graphs with overlapping cycles (404-417% faster) and on many small disjoint cycles (up to 521% faster for 100 two-node cycles), where cycle enumeration is most expensive
  • Consistent speedup across all cycle types: simple cycles (241-267% faster), disconnected cycles (310% faster), large single cycles (435-438% faster)
  • One exception: A large graph consisting only of self-loops (500 nodes) is 34% slower, due to the overhead of checking graph.has_edge(vertex, vertex) for each single-vertex SCC
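One way the self-loop overhead might be reduced is to collect self-loop vertices in a single pass over the edge list, instead of querying the graph once per single-vertex SCC. The sketch below is hypothetical and has not been benchmarked against this PR. It assumes the components are already available as sets (for example from nx.strongly_connected_components()), and both function names are illustrative.

```python
def self_loop_vertices(edges):
    """Collect every vertex that has an edge to itself, in one pass."""
    return {u for u, v in edges if u == v}


def cycle_vertices_from_components(components, edges):
    """Apply the two SCC conditions using a precomputed self-loop set."""
    loops = self_loop_vertices(edges)
    found = set()
    for component in components:
        if len(component) > 1:
            found.update(component)          # multi-vertex SCC
        else:
            found.update(component & loops)  # single vertex: keep only if looped
    return sorted(found)


# The shape of the slow case: 500 vertices, each its own SCC with a self-loop.
edges = [(i, i) for i in range(500)]
components = [{i} for i in range(500)]
print(len(cycle_vertices_from_components(components, edges)))
```

Whether this actually recovers the lost time would need to be measured; the set intersection still runs once per component.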

The optimization is most effective for graphs with many or overlapping cycles, where the original algorithm's cycle enumeration is most expensive.

Correctness verification report:

| Test | Status |
|---|---|
| ⚙️ Existing Unit Tests | 24 Passed |
| 🌀 Generated Regression Tests | 55 Passed |
| ⏪ Replay Tests | 🔘 None Found |
| 🔎 Concolic Coverage Tests | 🔘 None Found |
| 📊 Tests Coverage | 100.0% |

⚙️ Existing Unit Tests and Runtime

| Test File::Test Function | Original ⏱️ | Optimized ⏱️ | Speedup |
|---|---|---|---|
| test_dsa_nodes.py::test_complex_graph | 151μs | 35.9μs | 321%✅ |
| test_dsa_nodes.py::test_cycle_with_extra_nodes_edges | 85.2μs | 28.6μs | 198%✅ |
| test_dsa_nodes.py::test_figure_eight | 120μs | 23.5μs | 411%✅ |
| test_dsa_nodes.py::test_multiple_disjoint_cycles | 100.0μs | 24.2μs | 312%✅ |
| test_dsa_nodes.py::test_multiple_overlapping_cycles | 123μs | 23.9μs | 417%✅ |
| test_dsa_nodes.py::test_no_cycles_dag | 34.3μs | 19.6μs | 75.3%✅ |
| test_dsa_nodes.py::test_self_loop | 22.1μs | 14.6μs | 51.0%✅ |
| test_dsa_nodes.py::test_simple_triangle_cycle | 69.4μs | 18.8μs | 270%✅ |
| test_dsa_nodes.py::test_simple_two_node_cycle | 59.3μs | 17.2μs | 244%✅ |
| test_dsa_nodes.py::test_string_vertices | 89.0μs | 29.2μs | 205%✅ |

🌀 Generated Regression Tests and Runtime
import networkx as nx  # for the function to test
# imports
import pytest  # used for our unit tests
from src.dsa.nodes import find_cycle_vertices

# unit tests

# 1. Basic Test Cases

def test_no_edges():
    # Graph with no edges and no nodes
    codeflash_output = find_cycle_vertices([]) # 18.6μs -> 9.88μs (88.6% faster)

def test_single_edge_no_cycle():
    # One edge, no cycle possible
    codeflash_output = find_cycle_vertices([(1, 2)]) # 28.5μs -> 16.1μs (77.2% faster)

def test_two_nodes_cycle():
    # Two nodes forming a cycle
    codeflash_output = find_cycle_vertices([(1, 2), (2, 1)]) # 58.2μs -> 17.0μs (241% faster)

def test_three_node_cycle():
    # Three nodes in a cycle
    codeflash_output = find_cycle_vertices([(1, 2), (2, 3), (3, 1)]) # 69.5μs -> 19.0μs (266% faster)

def test_three_node_path_no_cycle():
    # Three nodes in a path, no cycle
    codeflash_output = find_cycle_vertices([(1, 2), (2, 3)]) # 32.0μs -> 18.4μs (73.8% faster)

def test_disconnected_cycles():
    # Two disconnected cycles
    edges = [(1, 2), (2, 1), (3, 4), (4, 3)]
    codeflash_output = find_cycle_vertices(edges) # 87.2μs -> 21.2μs (310% faster)

def test_cycle_and_noncycle():
    # One cycle, one path
    edges = [(1, 2), (2, 3), (3, 1), (4, 5)]
    codeflash_output = find_cycle_vertices(edges) # 75.0μs -> 23.2μs (223% faster)

def test_self_loop():
    # Single node with a self-loop
    codeflash_output = find_cycle_vertices([(1, 1)]) # 22.0μs -> 14.4μs (52.3% faster)

def test_multiple_self_loops():
    # Multiple nodes with self-loops
    edges = [(1, 1), (2, 2), (3, 4)]
    codeflash_output = find_cycle_vertices(edges) # 32.2μs -> 21.2μs (52.4% faster)

def test_overlapping_cycles():
    # Overlapping cycles: 1-2-3-1 and 2-3-4-2
    edges = [(1, 2), (2, 3), (3, 1), (3, 4), (4, 2)]
    codeflash_output = find_cycle_vertices(edges) # 110μs -> 21.7μs (410% faster)

# 2. Edge Test Cases

def test_empty_graph():
    # No nodes, no edges
    codeflash_output = find_cycle_vertices([]) # 18.7μs -> 9.79μs (90.6% faster)

def test_single_node_no_edges():
    # An edge list cannot express an isolated node, so this is the empty graph again
    codeflash_output = find_cycle_vertices([]) # 18.2μs -> 9.67μs (87.9% faster)

def test_single_node_self_loop():
    # One node, self-loop
    codeflash_output = find_cycle_vertices([(0, 0)]) # 22.1μs -> 14.6μs (51.0% faster)

def test_large_cycle():
    # Cycle of 11 nodes (0 through 10)
    edges = [(i, i+1) for i in range(10)]
    edges.append((10, 0))
    codeflash_output = find_cycle_vertices(edges) # 154μs -> 36.5μs (323% faster)

def test_cycle_with_tail():
    # Cycle with a tail node leading into it
    edges = [(0, 1), (1, 2), (2, 0), (3, 0)]
    codeflash_output = find_cycle_vertices(edges) # 73.1μs -> 21.5μs (239% faster)

def test_cycle_with_exit():
    # Cycle with an edge leaving the cycle
    edges = [(1, 2), (2, 3), (3, 1), (3, 4)]
    codeflash_output = find_cycle_vertices(edges) # 73.1μs -> 21.4μs (241% faster)

def test_multiple_components_some_with_cycles():
    # Multiple components, some cyclic, some not
    edges = [(1, 2), (2, 3), (3, 1), (4, 5), (6, 7), (7, 6), (8, 9)]
    codeflash_output = find_cycle_vertices(edges) # 109μs -> 31.4μs (250% faster)

def test_duplicate_edges():
    # Duplicate edges in the input
    edges = [(1, 2), (2, 3), (3, 1), (1, 2), (2, 3)]
    codeflash_output = find_cycle_vertices(edges) # 69.3μs -> 20.1μs (245% faster)

def test_graph_with_isolated_nodes():
    # Only the two-node cycle is present; an edge list cannot express isolated nodes
    edges = [(1, 2), (2, 1)]
    codeflash_output = find_cycle_vertices(edges) # 58.0μs -> 16.8μs (245% faster)

def test_graph_with_negative_and_zero_nodes():
    # Negative and zero as node labels
    edges = [(0, -1), (-1, -2), (-2, 0), (1, 2)]
    codeflash_output = find_cycle_vertices(edges) # 91.2μs -> 29.2μs (212% faster)

def test_graph_with_string_nodes():
    # Node labels are strings
    edges = [("a", "b"), ("b", "c"), ("c", "a"), ("d", "e")]
    codeflash_output = find_cycle_vertices(edges) # 80.3μs -> 24.0μs (235% faster)


def test_large_acyclic_graph():
    # Large DAG, should return empty list
    edges = [(i, i+1) for i in range(1000)]
    codeflash_output = find_cycle_vertices(edges) # 3.13ms -> 1.81ms (72.8% faster)

def test_large_single_cycle():
    # Large cycle of 1000 nodes
    edges = [(i, i+1) for i in range(999)]
    edges.append((999, 0))
    codeflash_output = find_cycle_vertices(edges) # 9.49ms -> 1.77ms (435% faster)

def test_large_graph_with_multiple_small_cycles():
    # 10 cycles of 10 nodes each, disconnected
    edges = []
    for k in range(10):
        base = k*10
        for i in range(10):
            edges.append((base+i, base+(i+1)%10))
    codeflash_output = find_cycle_vertices(edges); result = codeflash_output # 1.08ms -> 200μs (436% faster)

def test_large_graph_with_cycles_and_paths():
    # 5 cycles of 10 nodes, and 50 node path
    edges = []
    for k in range(5):
        base = k*10
        for i in range(10):
            edges.append((base+i, base+(i+1)%10))
    # Add a path
    edges += [(100+i, 100+i+1) for i in range(49)]
    codeflash_output = find_cycle_vertices(edges); result = codeflash_output # 721μs -> 205μs (252% faster)

def test_large_graph_sparse_cycles():
    # 100 cycles of 2 nodes each
    edges = []
    for i in range(0, 200, 2):
        edges.append((i, i+1))
        edges.append((i+1, i))
    codeflash_output = find_cycle_vertices(edges); result = codeflash_output # 2.36ms -> 380μs (521% faster)

def test_large_graph_with_self_loops():
    # 500 nodes, each with a self-loop
    edges = [(i, i) for i in range(500)]
    codeflash_output = find_cycle_vertices(edges); result = codeflash_output # 640μs -> 976μs (34.4% slower)

def test_large_graph_mixed():
    # 250 nodes in a single large cycle, 250 nodes with self-loops, 250 node path
    edges = []
    # Large cycle
    for i in range(250):
        edges.append((i, (i+1)%250))
    # Self-loops
    for i in range(250, 500):
        edges.append((i, i))
    # Path
    for i in range(500, 749):
        edges.append((i, i+1))
    codeflash_output = find_cycle_vertices(edges); result = codeflash_output # 3.60ms -> 1.40ms (157% faster)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
#------------------------------------------------
import networkx as nx
# imports
import pytest  # used for our unit tests
from src.dsa.nodes import find_cycle_vertices

# unit tests

# ------------------------
# Basic Test Cases
# ------------------------

def test_no_edges_empty_graph():
    # No edges, no vertices, so no cycles
    codeflash_output = find_cycle_vertices([]) # 19.0μs -> 10.0μs (88.8% faster)

def test_single_vertex_no_cycle():
    # Despite the test name, (1, 1) is a self-loop, so vertex 1 lies on a cycle
    codeflash_output = find_cycle_vertices([(1, 1)]) # 22.5μs -> 14.7μs (53.0% faster)

def test_two_node_cycle():
    # Simple 2-node cycle: 1->2->1
    codeflash_output = find_cycle_vertices([(1, 2), (2, 1)]) # 58.6μs -> 16.9μs (247% faster)

def test_three_node_cycle():
    # Simple 3-node cycle: 1->2->3->1
    codeflash_output = find_cycle_vertices([(1, 2), (2, 3), (3, 1)]) # 69.4μs -> 18.9μs (267% faster)

def test_disconnected_cycle_and_noncycle():
    # 1->2->3->1 is a cycle, 4->5 is not
    edges = [(1, 2), (2, 3), (3, 1), (4, 5)]
    codeflash_output = find_cycle_vertices(edges) # 75.3μs -> 23.0μs (227% faster)

def test_multiple_disconnected_cycles():
    # Two cycles: 1->2->1 and 3->4->5->3
    edges = [(1, 2), (2, 1), (3, 4), (4, 5), (5, 3)]
    codeflash_output = find_cycle_vertices(edges) # 96.8μs -> 23.6μs (310% faster)

def test_cycle_with_tail():
    # 1->2->3->1 is a cycle, 0->1 is a tail
    edges = [(0, 1), (1, 2), (2, 3), (3, 1)]
    codeflash_output = find_cycle_vertices(edges) # 72.2μs -> 21.5μs (236% faster)

def test_multiple_cycles_sharing_vertices():
    # 1->2->3->1 and 2->4->5->2
    edges = [(1, 2), (2, 3), (3, 1), (2, 4), (4, 5), (5, 2)]
    codeflash_output = find_cycle_vertices(edges) # 122μs -> 23.8μs (414% faster)

# ------------------------
# Edge Test Cases
# ------------------------

def test_self_loop():
    # Single self-loop
    codeflash_output = find_cycle_vertices([(42, 42)]) # 22.0μs -> 14.7μs (50.3% faster)

def test_multiple_self_loops():
    # Multiple self-loops, disconnected
    edges = [(1, 1), (2, 2), (3, 3)]
    codeflash_output = find_cycle_vertices(edges) # 25.2μs -> 19.3μs (30.4% faster)

def test_cycle_with_self_loop():
    # 1->2->3->1 is a cycle, 2->2 is a self-loop (should only appear once)
    edges = [(1, 2), (2, 3), (3, 1), (2, 2)]
    codeflash_output = find_cycle_vertices(edges) # 69.9μs -> 19.9μs (252% faster)

def test_cycle_with_non_participating_nodes():
    # 1->2->3->1 is a cycle, 4->5->6 is not
    edges = [(1, 2), (2, 3), (3, 1), (4, 5), (5, 6)]
    codeflash_output = find_cycle_vertices(edges) # 79.1μs -> 25.8μs (206% faster)

def test_empty_edges():
    # No edges at all
    codeflash_output = find_cycle_vertices([]) # 18.4μs -> 9.67μs (90.5% faster)

def test_cycle_with_duplicate_edges():
    # 1->2->3->1 is a cycle, with duplicate edges
    edges = [(1, 2), (2, 3), (3, 1), (1, 2), (2, 3)]
    codeflash_output = find_cycle_vertices(edges) # 69.6μs -> 19.8μs (252% faster)

def test_large_single_vertex_self_loop():
    # Large value vertex with self-loop
    codeflash_output = find_cycle_vertices([(999999, 999999)]) # 22.9μs -> 16.0μs (43.0% faster)

def test_cycle_with_isolated_vertex():
    # 1->2->3->1 is a cycle; 4 has only a self-loop, which also counts as a cycle
    edges = [(1, 2), (2, 3), (3, 1)]
    codeflash_output = find_cycle_vertices(edges + [(4, 4)]) # 70.9μs -> 21.5μs (230% faster)

def test_cycle_with_non_integer_nodes():
    # Using string nodes
    edges = [("a", "b"), ("b", "c"), ("c", "a"), ("d", "e")]
    codeflash_output = find_cycle_vertices(edges) # 80.3μs -> 24.4μs (229% faster)


def test_multiple_cycles_with_overlap():
    # 1->2->3->1 and 3->4->5->3 (3 is shared)
    edges = [(1, 2), (2, 3), (3, 1), (3, 4), (4, 5), (5, 3)]
    codeflash_output = find_cycle_vertices(edges) # 128μs -> 25.0μs (414% faster)

def test_cycle_with_branches():
    # 1->2->3->1 is a cycle, 2->4 is a branch
    edges = [(1, 2), (2, 3), (3, 1), (2, 4)]
    codeflash_output = find_cycle_vertices(edges) # 73.6μs -> 22.2μs (232% faster)

# ------------------------
# Large Scale Test Cases
# ------------------------

def test_large_acyclic_graph():
    # Large DAG: no cycles
    edges = [(i, i+1) for i in range(1000)]
    codeflash_output = find_cycle_vertices(edges) # 3.13ms -> 1.81ms (73.2% faster)

def test_large_single_cycle():
    # Large cycle: 0->1->2->...->999->0
    edges = [(i, (i+1)%1000) for i in range(1000)]
    codeflash_output = find_cycle_vertices(edges) # 9.50ms -> 1.77ms (438% faster)

def test_large_graph_with_multiple_small_cycles():
    # 10 cycles of length 10, disjoint
    edges = []
    for base in range(0, 100, 10):
        for i in range(base, base+10):
            edges.append((i, base + ((i-base+1)%10)))
    codeflash_output = find_cycle_vertices(edges) # 1.08ms -> 203μs (433% faster)

def test_large_graph_with_cycles_and_noncycles():
    # 500-node cycle, 500-node chain
    cycle_edges = [(i, (i+1)%500) for i in range(500)]
    chain_edges = [(i+500, i+501) for i in range(499)]
    edges = cycle_edges + chain_edges
    codeflash_output = find_cycle_vertices(edges) # 6.40ms -> 1.80ms (255% faster)

def test_large_graph_with_self_loops_and_cycles():
    # 100 self-loops, 100-node cycle
    self_loops = [(i, i) for i in range(100)]
    cycle_edges = [(100+i, 100+((i+1)%100)) for i in range(100)]
    edges = self_loops + cycle_edges
    codeflash_output = find_cycle_vertices(edges) # 1.13ms -> 389μs (191% faster)

# ------------------------
# Miscellaneous/Regression Tests
# ------------------------

def test_cycle_vertices_are_sorted():
    # Ensure output is sorted
    edges = [(3, 1), (1, 2), (2, 3)]
    codeflash_output = find_cycle_vertices(edges); result = codeflash_output # 69.8μs -> 19.4μs (259% faster)

def test_multiple_cycles_with_duplicate_nodes():
    # 1->2->1 and 2->3->4->2, node 2 in both cycles
    edges = [(1, 2), (2, 1), (2, 3), (3, 4), (4, 2)]
    codeflash_output = find_cycle_vertices(edges) # 111μs -> 22.2μs (404% faster)

def test_large_sparse_graph_with_one_cycle():
    # Chain 0->1->...->997 leading into the 3-cycle 997->998->999->997 (1000 nodes total)
    edges = [(i, i+1) for i in range(997)] + [(997, 998), (998, 999), (999, 997)]
    codeflash_output = find_cycle_vertices(edges) # 3.14ms -> 1.80ms (74.4% faster)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
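The generated tests above capture results in codeflash_output without asserting concrete values. As a hedged sketch, an independent brute-force oracle like the one below could be used to pin down expected results: a vertex lies on a cycle exactly when it can reach itself along one or more edges. The function name and the sorted-list return value are assumptions, and this is not code from the PR.

```python
from collections import defaultdict


def find_cycle_vertices_bruteforce(edges):
    """Reference oracle: quadratic in the worst case, but simple to audit."""
    succ = defaultdict(set)
    for u, v in edges:
        succ[u].add(v)

    def reachable_from(start):
        """Vertices reachable from start by following one or more edges."""
        seen, stack = set(), list(succ.get(start, ()))
        while stack:
            node = stack.pop()
            if node not in seen:
                seen.add(node)
                stack.extend(succ.get(node, ()))
        return seen

    vertices = {x for edge in edges for x in edge}
    return sorted(v for v in vertices if v in reachable_from(v))


# Inputs taken from test_cycle_with_tail and test_cycle_with_isolated_vertex.
print(find_cycle_vertices_bruteforce([(0, 1), (1, 2), (2, 0), (3, 0)]))  # [0, 1, 2]
print(find_cycle_vertices_bruteforce([(1, 2), (2, 3), (3, 1), (4, 4)]))  # [1, 2, 3, 4]
```

Because the oracle does not use SCCs or cycle enumeration, agreement with it is evidence that both the original and the optimized implementations compute the same set.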

To edit these changes, run `git checkout codeflash/optimize-find_cycle_vertices-mejgju1v` and push.

@codeflash-ai codeflash-ai bot added the ⚡️ codeflash Optimization PR opened by Codeflash AI label Aug 20, 2025
@codeflash-ai codeflash-ai bot requested a review from KRRT7 August 20, 2025 04:13