⚡️ Speed up function `find_cycle_vertices` by 215% #26

codeflash-ai · 2025-06-23T06:20:04Z

📄 215% (2.15x) speedup for `find_cycle_vertices` in `src/dsa/nodes.py`

⏱️ Runtime : 80.6 milliseconds → 25.6 milliseconds (best of 80 runs)
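
The headline figures can be reproduced from the two runtimes. The sketch below assumes the percentage is computed as (original - optimized) / optimized; that formula is inferred from the reported numbers and is not stated in the PR. `speedup_percent` is a hypothetical helper name.

```python
def speedup_percent(original: float, optimized: float) -> float:
    """Relative gain in percent, measured against the optimized runtime.

    Assumed formula (inferred from the figures in this PR, not documented here).
    """
    return (original - optimized) / optimized * 100.0


# Both runtimes are in milliseconds, taken from the line above.
pct = speedup_percent(80.6, 25.6)
ratio = 80.6 / 25.6  # plain "how many times shorter" runtime ratio

print(f"{pct:.0f}% faster, {ratio:.2f}x runtime ratio")
```

Under that reading, the "2.15x" in the headline is the 215% gain written as a fraction, while the plain runtime ratio is about 3.15x.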

📝 Explanation and details

Let's break down the profiling results and focus on performance improvements.

Profiling Analysis

From your profiler:

`graph = nx.DiGraph(edges)` takes 79.1% of the time.
`cycles = list(nx.simple_cycles(graph))` takes 20.7% of the time.
Other lines are negligible.

So, graph construction from the edge list is the main bottleneck, followed by finding all cycles.

Step 1: Speed Up nx.DiGraph Construction

NetworkX can be slow for large graphs or when constructing from a dense edge list. However, if your edge list is already an efficient representation (tuples like (u, v)), there’s little to optimize with NetworkX itself.

Suggestions:

If feasible, ensure edges is a list or tuple (not a generator or slower structure).
Avoid unnecessary copies and build the graph only from what's needed.

Alternative: Native Algorithms

If you only need cycle detection and the nodes, you could avoid NetworkX altogether for further speedup—replacing it with a native DFS (Tarjan's algorithm) or similar. But if NetworkX and its API must be retained, see below.
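
For illustration, a NetworkX-free variant could look roughly like the sketch below. It is not code from this PR: `find_cycle_vertices_native` is a hypothetical name, the iterative form of Tarjan's algorithm is one possible choice, and the sorted-list return value is an assumption based on the `expected = sorted(...)` lines in the generated tests.

```python
def find_cycle_vertices_native(edges):
    """Return the sorted vertices that lie on at least one directed cycle."""
    adj = {}
    result = set()
    for u, v in edges:
        adj.setdefault(u, []).append(v)
        adj.setdefault(v, [])
        if u == v:
            result.add(u)  # a self-loop is a cycle of length one

    index, low = {}, {}          # discovery order and Tarjan lowlink per vertex
    stack, on_stack = [], set()  # vertices of the components still being built
    counter = 0

    for root in adj:
        if root in index:
            continue
        index[root] = low[root] = counter
        counter += 1
        stack.append(root)
        on_stack.add(root)
        work = [(root, iter(adj[root]))]  # explicit stack instead of recursion
        while work:
            v, successors = work[-1]
            descended = False
            for w in successors:
                if w not in index:
                    index[w] = low[w] = counter
                    counter += 1
                    stack.append(w)
                    on_stack.add(w)
                    work.append((w, iter(adj[w])))
                    descended = True
                    break
                if w in on_stack:
                    low[v] = min(low[v], index[w])
            if descended:
                continue
            work.pop()  # every successor of v has been handled
            if work:
                parent = work[-1][0]
                low[parent] = min(low[parent], low[v])
            if low[v] == index[v]:  # v is the root of a finished SCC
                component = []
                while True:
                    w = stack.pop()
                    on_stack.discard(w)
                    component.append(w)
                    if w == v:
                        break
                if len(component) > 1:
                    result.update(component)
    return sorted(result)


print(find_cycle_vertices_native([(1, 2), (2, 3), (3, 1), (4, 2)]))
```

The explicit work stack avoids Python's recursion limit on long chains such as the 1000-node cases in the generated tests.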

Step 2: Optimize Cycle Discovery

`nx.simple_cycles(graph)` is NetworkX's built-in simple-cycle enumerator (based on Johnson's algorithm); a hand-written alternative would add code and might only be faster for certain graph densities.
If you do not need to enumerate all cycles and only care about the nodes involved in any cycle, you could compute the strongly connected components (SCCs). Any SCC of size > 1, or with a self-loop, contains a cycle.

Step 3: Optimize Cycle Vertex Extraction

Instead of flattening all cycles, collect nodes in SCCs of size > 1 (and nodes with self-loops). This is faster than enumerating all cycles.

Fastest Solution

Let’s use strongly connected components. For each SCC:

If it contains more than one node, every node in it lies on at least one cycle.
If it contains a single node, check for a self-loop.
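
The rewritten function itself is not reproduced on this page. The two rules above can be sketched as follows, assuming the function takes an edge list and returns a sorted list of vertices (as the generated tests suggest). `find_cycle_vertices_scc` is a hypothetical name, and this is an illustration of the described approach rather than the PR's actual code.

```python
import networkx as nx


def find_cycle_vertices_scc(edges):
    """Collect vertices on any directed cycle via strongly connected components."""
    graph = nx.DiGraph(edges)
    cycle_vertices = set()
    for component in nx.strongly_connected_components(graph):
        if len(component) > 1:
            # Every node of a multi-node SCC lies on some cycle.
            cycle_vertices.update(component)
        else:
            # A single-node SCC only counts if the node has a self-loop.
            (node,) = component
            if graph.has_edge(node, node):
                cycle_vertices.add(node)
    # Assumption: callers expect a sorted list, so labels must be mutually comparable.
    return sorted(cycle_vertices)


print(find_cycle_vertices_scc([(1, 2), (2, 3), (3, 1), (4, 2)]))
```

Only `nx.DiGraph`, `nx.strongly_connected_components` and `DiGraph.has_edge` are used, so no cycle is ever enumerated explicitly.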

Why is this faster?

No cycle enumeration needed: strongly connected components can be found in O(V + E) time, whereas the number of simple cycles can grow exponentially with the size of the graph.
No flattening of all cycles: with SCCs, each node is collected once.
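
To get a feel for how quickly enumeration can blow up, consider a complete directed graph on n nodes without self-loops. This is a worst-case illustration chosen here, not one of the PR's benchmarks. Choosing k nodes and one of their (k-1)! cyclic orders gives every simple cycle exactly once. `simple_cycle_count_complete_digraph` is a hypothetical helper name.

```python
from math import comb, factorial


def simple_cycle_count_complete_digraph(n: int) -> int:
    """Count simple directed cycles in a complete digraph on n nodes (no self-loops)."""
    # Pick k of the n nodes, then one of the (k - 1)! distinct cyclic orderings.
    return sum(comb(n, k) * factorial(k - 1) for k in range(2, n + 1))


for n in (3, 5, 10):
    print(n, simple_cycle_count_complete_digraph(n))
```

Ten nodes already give 1,112,073 simple cycles, yet the whole graph is a single strongly connected component, so an SCC pass touches each node and edge only a constant number of times.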

Compatibility

Works with all versions of NetworkX.
No additional dependencies.
Preserves function signature and result.

Summary of Optimizations

Replaces slow call: `nx.simple_cycles` → much faster SCC analysis.
Minimal code change: the function stays short and easy to maintain.
No unnecessary flattening: Only collects nodes once.

If you are allowed to avoid NetworkX entirely, let me know for a native, even faster solution! This version, however, gives a major speedup for graphs with multi-node cycles; the two large-scale self-loop-only generated tests ran slightly slower (4.59% and 34.0%).

Full rewritten code.

This should deliver dramatic speed improvement over the original, especially for larger graphs!

✅ Correctness verification report:

| Test | Status |
| --- | --- |
| ⚙️ Existing Unit Tests | ✅ 26 Passed |
| 🌀 Generated Regression Tests | ✅ 52 Passed |
| ⏪ Replay Tests | 🔘 None Found |
| 🔎 Concolic Coverage Tests | 🔘 None Found |
| 📊 Tests Coverage | 100.0% |

⚙️ Existing Unit Tests and Runtime

| Test File::Test Function | Original ⏱️ | Optimized ⏱️ | Speedup |
| --- | --- | --- | --- |
| `test_dsa_nodes.py::test_complex_graph` | 186μs | 43.0μs | ✅334% |
| `test_dsa_nodes.py::test_cycle_with_extra_nodes_edges` | 109μs | 36.5μs | ✅200% |
| `test_dsa_nodes.py::test_figure_eight` | 148μs | 28.3μs | ✅422% |
| `test_dsa_nodes.py::test_multiple_disjoint_cycles` | 119μs | 28.1μs | ✅324% |
| `test_dsa_nodes.py::test_multiple_overlapping_cycles` | 147μs | 28.0μs | ✅425% |
| `test_dsa_nodes.py::test_no_cycles_dag` | 39.2μs | 22.0μs | ✅78.8% |
| `test_dsa_nodes.py::test_self_loop` | 26.0μs | 17.3μs | ✅50.1% |
| `test_dsa_nodes.py::test_simple_triangle_cycle` | 80.8μs | 22.1μs | ✅265% |
| `test_dsa_nodes.py::test_simple_two_node_cycle` | 69.4μs | 20.1μs | ✅245% |
| `test_dsa_nodes.py::test_string_vertices` | 103μs | 34.0μs | ✅204% |

🌀 Generated Regression Tests and Runtime

import networkx as nx
# imports
import pytest  # used for our unit tests
from src.dsa.nodes import find_cycle_vertices

# unit tests

# ------------------------
# BASIC TEST CASES
# ------------------------

def test_no_edges():
    # No edges, no cycles
    codeflash_output = find_cycle_vertices([]) # 21.9μs -> 11.8μs (85.2% faster)

def test_no_cycles_linear_chain():
    # Linear chain: 1->2->3->4, no cycles
    edges = [(1, 2), (2, 3), (3, 4)]
    codeflash_output = find_cycle_vertices(edges) # 43.0μs -> 24.2μs (77.3% faster)

def test_single_self_loop():
    # Single node with a self-loop
    edges = [(1, 1)]
    codeflash_output = find_cycle_vertices(edges) # 24.6μs -> 16.6μs (48.2% faster)

def test_simple_cycle():
    # Simple 3-node cycle: 1->2->3->1
    edges = [(1, 2), (2, 3), (3, 1)]
    codeflash_output = find_cycle_vertices(edges) # 82.5μs -> 21.9μs (277% faster)

def test_two_disjoint_cycles():
    # Two disjoint cycles: 1->2->1 and 3->4->5->3
    edges = [(1, 2), (2, 1), (3, 4), (4, 5), (5, 3)]
    codeflash_output = find_cycle_vertices(edges) # 118μs -> 27.7μs (329% faster)

def test_cycle_with_tail():
    # 1->2->3->1 is a cycle, 4->2 is a tail
    edges = [(1, 2), (2, 3), (3, 1), (4, 2)]
    codeflash_output = find_cycle_vertices(edges) # 86.0μs -> 25.0μs (244% faster)

def test_cycle_with_branch():
    # 1->2->3->1 is a cycle, 2->4 is a branch
    edges = [(1, 2), (2, 3), (3, 1), (2, 4)]
    codeflash_output = find_cycle_vertices(edges) # 87.2μs -> 24.8μs (252% faster)

def test_multiple_overlapping_cycles():
    # 1->2->3->1 and 2->4->5->2
    edges = [(1, 2), (2, 3), (3, 1), (2, 4), (4, 5), (5, 2)]
    # All five vertices are in at least one cycle (4 and 5 lie on 2->4->5->2)
    codeflash_output = find_cycle_vertices(edges) # 149μs -> 28.1μs (433% faster)

def test_cycle_with_extra_edges():
    # 1->2->3->1 is a cycle, 2->4 and 4->5 (no cycle)
    edges = [(1, 2), (2, 3), (3, 1), (2, 4), (4, 5)]
    codeflash_output = find_cycle_vertices(edges) # 91.7μs -> 27.7μs (231% faster)

# ------------------------
# EDGE TEST CASES
# ------------------------

def test_single_node_no_edges():
    # Empty edge list: an isolated node cannot be written as an edge, so the graph has no nodes
    edges = []
    codeflash_output = find_cycle_vertices(edges) # 21.5μs -> 11.1μs (94.0% faster)

def test_single_node_self_loop():
    # One node with a self-loop
    edges = [(0, 0)]
    codeflash_output = find_cycle_vertices(edges) # 25.2μs -> 16.9μs (49.4% faster)

def test_disconnected_graph_with_and_without_cycles():
    # Two components: 1->2->1 (cycle), 3->4 (no cycle)
    edges = [(1, 2), (2, 1), (3, 4)]
    codeflash_output = find_cycle_vertices(edges) # 76.9μs -> 24.5μs (214% faster)

def test_duplicate_edges_in_cycle():
    # 1->2->3->1, but 1->2 appears twice
    edges = [(1, 2), (1, 2), (2, 3), (3, 1)]
    codeflash_output = find_cycle_vertices(edges) # 81.0μs -> 22.3μs (263% faster)

def test_multiple_self_loops():
    # 1->1, 2->2, 3->4->5->3 (cycle)
    edges = [(1, 1), (2, 2), (3, 4), (4, 5), (5, 3)]
    codeflash_output = find_cycle_vertices(edges) # 84.8μs -> 28.2μs (200% faster)

def test_cycle_with_noninteger_vertices():
    # Use string labels
    edges = [("a", "b"), ("b", "c"), ("c", "a"), ("d", "e")]
    codeflash_output = find_cycle_vertices(edges) # 94.2μs -> 28.7μs (229% faster)

def test_empty_graph():
    # No nodes, no edges
    codeflash_output = find_cycle_vertices([]) # 21.2μs -> 11.1μs (91.4% faster)

def test_cycle_with_negative_and_zero_vertices():
    # Vertices: -1->0->-1 (cycle), 1->2 (no cycle)
    edges = [(-1, 0), (0, -1), (1, 2)]
    codeflash_output = find_cycle_vertices(edges) # 76.2μs -> 25.2μs (202% faster)

def test_cycle_with_large_labels():
    # Large integer labels
    edges = [(1000000, 2000000), (2000000, 1000000)]
    codeflash_output = find_cycle_vertices(edges) # 73.8μs -> 22.0μs (235% faster)

def test_graph_with_no_cycles_but_many_edges():
    # DAG with many edges, no cycles
    edges = [(i, i+1) for i in range(10)]
    codeflash_output = find_cycle_vertices(edges) # 73.7μs -> 42.9μs (71.7% faster)

# ------------------------
# LARGE SCALE TEST CASES
# ------------------------

def test_large_cycle():
    # Large single cycle: 0->1->2->...->999->0
    n = 1000
    edges = [(i, (i+1)%n) for i in range(n)]
    codeflash_output = find_cycle_vertices(edges) # 12.1ms -> 2.27ms (435% faster)

def test_large_acyclic_graph():
    # Large DAG: 0->1->2->...->999, no cycles
    n = 1000
    edges = [(i, i+1) for i in range(n-1)]
    codeflash_output = find_cycle_vertices(edges) # 3.89ms -> 2.27ms (71.3% faster)

def test_large_graph_with_multiple_cycles_and_branches():
    # Two large cycles and some branches
    n = 500
    # First cycle: 0->1->...->499->0
    cycle1 = [(i, (i+1)%n) for i in range(n)]
    # Second cycle: 500->501->...->999->500
    cycle2 = [(i, i+1) for i in range(n, 2*n-1)] + [(2*n-1, n)]
    # Branches: 250->750, 400->800 (cross edges)
    branches = [(250, 750), (400, 800)]
    edges = cycle1 + cycle2 + branches
    expected = list(range(0, n)) + list(range(n, 2*n))
    codeflash_output = find_cycle_vertices(edges) # 12.4ms -> 2.27ms (445% faster)

def test_large_sparse_graph_with_self_loops():
    # Many nodes, only a few self-loops
    n = 1000
    edges = [(i, i) for i in range(0, n, 100)]
    codeflash_output = find_cycle_vertices(edges) # 45.9μs -> 48.1μs (4.59% slower)

def test_large_graph_with_overlapping_cycles():
    # 0->1->2->...->499->0 (cycle)
    # 250->500->750->250 (overlapping cycle)
    n = 1000
    edges = [(i, (i+1)%500) for i in range(500)]  # first cycle
    edges += [(250, 500), (500, 750), (750, 250)]  # second cycle
    expected = list(range(500)) + [250, 500, 750]
    # Remove duplicates for expected
    expected = sorted(set(expected))
    codeflash_output = find_cycle_vertices(edges) # 6.20ms -> 1.15ms (437% faster)

# ------------------------
# ADDITIONAL EDGE CASES
# ------------------------

def test_cycle_with_tuple_labels():
    # Vertices are tuples
    edges = [((1,2), (2,3)), ((2,3), (3,1)), ((3,1), (1,2))]
    codeflash_output = find_cycle_vertices(edges) # 92.0μs -> 26.0μs (254% faster)


def test_cycle_with_floats():
    # Vertices are floats
    edges = [(1.1, 2.2), (2.2, 1.1)]
    codeflash_output = find_cycle_vertices(edges) # 77.5μs -> 23.5μs (231% faster)

def test_cycle_with_bool_labels():
    # Vertices are booleans
    edges = [(True, False), (False, True)]
    codeflash_output = find_cycle_vertices(edges) # 71.3μs -> 20.3μs (251% faster)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.

import random

# function to test
# derived from https://github.com/langflow-ai/langflow/pull/5262
import networkx as nx
# imports
import pytest  # used for our unit tests
from src.dsa.nodes import find_cycle_vertices

# unit tests

# ----------------------------
# 1. Basic Test Cases
# ----------------------------

def test_no_edges():
    # No edges, so no cycles
    codeflash_output = find_cycle_vertices([]) # 21.6μs -> 11.0μs (96.6% faster)

def test_single_self_loop():
    # One node with a self-loop forms a cycle
    codeflash_output = find_cycle_vertices([(1, 1)]) # 25.3μs -> 16.8μs (50.9% faster)

def test_two_node_cycle():
    # Two nodes forming a cycle: 1 -> 2 -> 1
    codeflash_output = find_cycle_vertices([(1, 2), (2, 1)]) # 67.8μs -> 19.4μs (249% faster)

def test_three_node_cycle():
    # Three nodes in a cycle: 1 -> 2 -> 3 -> 1
    codeflash_output = find_cycle_vertices([(1, 2), (2, 3), (3, 1)]) # 80.0μs -> 21.7μs (269% faster)

def test_disconnected_graph_with_cycle():
    # Disconnected graph: one component with a cycle, one without
    edges = [(1, 2), (2, 3), (3, 1), (4, 5)]
    codeflash_output = find_cycle_vertices(edges) # 88.8μs -> 26.8μs (232% faster)

def test_disconnected_graph_no_cycle():
    # Disconnected graph with no cycles
    edges = [(1, 2), (3, 4)]
    codeflash_output = find_cycle_vertices(edges) # 40.2μs -> 22.5μs (78.4% faster)

def test_multiple_cycles():
    # Two cycles: 1->2->3->1 and 4->5->4
    edges = [(1, 2), (2, 3), (3, 1), (4, 5), (5, 4)]
    codeflash_output = find_cycle_vertices(edges) # 117μs -> 27.5μs (325% faster)

def test_cycle_and_tail():
    # 1->2->3->1 is a cycle, 4->1 is a tail into the cycle
    edges = [(1, 2), (2, 3), (3, 1), (4, 1)]
    codeflash_output = find_cycle_vertices(edges) # 84.9μs -> 24.5μs (247% faster)

def test_multiple_cycles_with_shared_vertex():
    # 1->2->3->1 and 3->4->5->3
    edges = [(1, 2), (2, 3), (3, 1), (3, 4), (4, 5), (5, 3)]
    codeflash_output = find_cycle_vertices(edges) # 146μs -> 27.8μs (429% faster)

# ----------------------------
# 2. Edge Test Cases
# ----------------------------

def test_empty_graph():
    # No nodes or edges
    codeflash_output = find_cycle_vertices([]) # 21.3μs -> 11.0μs (93.2% faster)

def test_single_node_no_self_loop():
    # Single edge 1->2 with no path back, so no cycle
    codeflash_output = find_cycle_vertices([(1, 2)]) # 32.8μs -> 18.4μs (78.1% faster)

def test_multiple_self_loops():
    # Multiple nodes with self-loops
    edges = [(1, 1), (2, 2), (3, 3)]
    codeflash_output = find_cycle_vertices(edges) # 29.0μs -> 22.5μs (28.6% faster)

def test_parallel_edges():
    # Parallel edges between the same nodes, forming a 2-node cycle
    edges = [(1, 2), (1, 2), (2, 1)]
    codeflash_output = find_cycle_vertices(edges) # 68.1μs -> 19.9μs (242% faster)

def test_large_cycle_with_tail():
    # 1->2->3->4->5->1 is a cycle, 6->1 is a tail
    edges = [(1, 2), (2, 3), (3, 4), (4, 5), (5, 1), (6, 1)]
    codeflash_output = find_cycle_vertices(edges) # 112μs -> 30.0μs (274% faster)

def test_cycle_with_isolated_node():
    # 1->2->3->1 is a cycle; an isolated node 4 cannot appear in an edge list, so it is omitted
    edges = [(1, 2), (2, 3), (3, 1)]
    codeflash_output = find_cycle_vertices(edges) # 80.8μs -> 21.8μs (271% faster)

def test_cycle_with_duplicate_edges():
    # Duplicate edges in the cycle
    edges = [(1, 2), (2, 3), (3, 1), (1, 2)]
    codeflash_output = find_cycle_vertices(edges) # 80.8μs -> 22.3μs (262% faster)

def test_graph_with_no_cycles():
    # Directed acyclic graph (DAG)
    edges = [(1, 2), (2, 3), (3, 4)]
    codeflash_output = find_cycle_vertices(edges) # 41.9μs -> 23.2μs (80.6% faster)

def test_cycle_with_non_integer_nodes():
    # Nodes are strings
    edges = [("a", "b"), ("b", "c"), ("c", "a")]
    codeflash_output = find_cycle_vertices(edges) # 84.8μs -> 23.4μs (263% faster)



def test_large_single_cycle():
    # Large cycle of 1000 nodes
    N = 1000
    edges = [(i, i+1) for i in range(N-1)] + [(N-1, 0)]
    codeflash_output = find_cycle_vertices(edges) # 12.1ms -> 2.26ms (434% faster)

def test_large_acyclic_graph():
    # Large DAG: 1000 nodes in a line
    N = 1000
    edges = [(i, i+1) for i in range(N-1)]
    codeflash_output = find_cycle_vertices(edges) # 3.91ms -> 2.28ms (71.2% faster)

def test_large_graph_with_multiple_cycles():
    # Two large cycles, each of 500 nodes, disjoint
    N = 500
    edges = (
        [(i, i+1) for i in range(N-1)] + [(N-1, 0)] +  # first cycle
        [(N+i, N+i+1) for i in range(N-1)] + [(2*N-1, N)]  # second cycle
    )
    codeflash_output = find_cycle_vertices(edges) # 12.3ms -> 2.27ms (442% faster)

def test_large_sparse_graph_with_few_cycles():
    # 1000 nodes, mostly sparse, with a small cycle
    N = 1000
    edges = [(i, i+1) for i in range(N-1)]  # chain
    # Add a small cycle at the end
    edges += [(997, 998), (998, 999), (999, 997)]
    expected = [997, 998, 999]
    codeflash_output = find_cycle_vertices(edges) # 3.95ms -> 2.28ms (73.4% faster)

def test_large_graph_with_self_loops():
    # 1000 nodes, each with a self-loop
    N = 1000
    edges = [(i, i) for i in range(N)]
    codeflash_output = find_cycle_vertices(edges) # 1.67ms -> 2.52ms (34.0% slower)

def test_large_graph_random_edges_no_cycles():
    # 1000-node chain (a DAG) with its edges supplied in random order
    N = 1000
    edges = [(i, i+1) for i in range(N-1)]
    # Shuffling changes only the insertion order; the chain stays acyclic
    random.shuffle(edges)
    codeflash_output = find_cycle_vertices(edges) # 4.22ms -> 2.47ms (70.7% faster)

def test_large_graph_random_edges_with_cycles():
    # 1000-node chain, then close a cycle among the last 10 nodes
    N = 1000
    edges = [(i, i+1) for i in range(N-1)]
    # Re-add the chain edges 990->...->999 and add 999->990 to close the cycle
    cycle_nodes = list(range(N-10, N))
    for i in range(10):
        edges.append((cycle_nodes[i], cycle_nodes[(i+1)%10]))
    expected = list(range(N-10, N))
    codeflash_output = find_cycle_vertices(edges); result = codeflash_output # 4.01ms -> 2.28ms (76.0% faster)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
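
The generated tests above keep no explicit assertions, so an independent reference can help when reading them. The sketch below is a brute-force oracle based on reachability: a vertex is on a cycle exactly when it can reach itself through at least one edge. `cycle_vertices_oracle` is a hypothetical helper, and the expected values are derived by hand from edge lists used in the tests, not taken from the PR.

```python
def cycle_vertices_oracle(edges):
    """Brute-force reference: v is on a cycle iff v reaches itself via >= 1 edge."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set())

    def reaches_itself(start):
        seen = set()
        frontier = list(adj[start])  # begin from the direct successors
        while frontier:
            node = frontier.pop()
            if node == start:
                return True
            if node not in seen:
                seen.add(node)
                frontier.extend(adj[node])
        return False

    return sorted(v for v in adj if reaches_itself(v))


# Edge lists copied from the tests above; expected values derived by hand.
cases = {
    "linear chain": ([(1, 2), (2, 3), (3, 4)], []),
    "cycle with tail": ([(1, 2), (2, 3), (3, 1), (4, 2)], [1, 2, 3]),
    "overlapping cycles": ([(1, 2), (2, 3), (3, 1), (2, 4), (4, 5), (5, 2)], [1, 2, 3, 4, 5]),
    "self-loops plus cycle": ([(1, 1), (2, 2), (3, 4), (4, 5), (5, 3)], [1, 2, 3, 4, 5]),
}
for name, (edges, expected) in cases.items():
    assert cycle_vertices_oracle(edges) == expected, name
```

The oracle runs one search per vertex, so it is only suitable for small graphs, but it shares no logic with either the cycle-enumeration or the SCC approach.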

To edit these changes, `git checkout codeflash/optimize-find_cycle_vertices-mc8pivnf` and push.


codeflash-ai bot added the ⚡️ codeflash label (Optimization PR opened by Codeflash AI) on Jun 23, 2025

codeflash-ai bot requested a review from KRRT7 June 23, 2025 06:20

KRRT7 closed this Jun 23, 2025

codeflash-ai bot deleted the codeflash/optimize-find_cycle_vertices-mc8pivnf branch June 23, 2025 23:31
