⚡️ Speed up function find_cycle_vertices by 215% #26

Closed

@codeflash-ai codeflash-ai bot commented Jun 23, 2025

📄 215% (2.15x) speedup for find_cycle_vertices in src/dsa/nodes.py

⏱️ Runtime : 80.6 milliseconds → 25.6 milliseconds (best of 80 runs)
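
The percentages in this report appear consistent with measuring the time saved relative to the optimized runtime. A minimal sketch of that arithmetic, assuming this formula (the helper name `speedup_percent` is hypothetical and not part of Codeflash):

```python
def speedup_percent(original: float, optimized: float) -> float:
    """Time saved, as a percentage of the optimized runtime (assumed formula)."""
    return (original - optimized) / optimized * 100.0


# Headline runtimes from this report, in milliseconds.
print(f"{speedup_percent(80.6, 25.6):.0f}%")  # 215%
```

Under this definition, a 215% speedup means the original takes about 80.6 / 25.6 ≈ 3.15 times as long as the optimized version.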

📝 Explanation and details

Let's break down the profiling results and focus on performance improvements.

Profiling Analysis

From the profiler output:

  • graph = nx.DiGraph(edges) takes 79.1% of the time.
  • cycles = list(nx.simple_cycles(graph)) takes 20.7% of the time.
  • Other lines are negligible.

So, graph construction from the edge list is the main bottleneck, followed by finding all cycles.


Step 1: Speed Up nx.DiGraph Construction

NetworkX can be slow for large graphs or when constructing from a dense edge list. However, if your edge list is already an efficient representation (tuples like (u, v)), there’s little to optimize with NetworkX itself.

Suggestions:

  • If feasible, ensure edges is a list or tuple (not a generator or slower structure).
  • Avoid unnecessary copies and build the graph only from what's needed.

Alternative: Native Algorithms

If you only need cycle detection and the nodes, you could avoid NetworkX altogether for further speedup—replacing it with a native DFS (Tarjan's algorithm) or similar. But if NetworkX and its API must be retained, see below.
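
One way the native route mentioned above might look is an iterative version of Tarjan's SCC algorithm. This is a sketch, not code from this PR: the name `find_cycle_vertices_native` is hypothetical, and returning a sorted list is an assumption about what callers expect.

```python
from collections import defaultdict


def find_cycle_vertices_native(edges):
    """Return the sorted vertices that lie on at least one directed cycle."""
    adjacency = defaultdict(list)
    nodes = set()
    in_cycle = set()
    for u, v in edges:
        adjacency[u].append(v)
        nodes.update((u, v))
        if u == v:
            in_cycle.add(u)  # a self-loop is a cycle of length one

    index = {}        # discovery order of each visited node
    lowlink = {}      # smallest discovery index reachable from the node
    stack = []        # Tarjan's stack of nodes not yet assigned to an SCC
    on_stack = set()
    counter = 0

    for root in nodes:
        if root in index:
            continue
        index[root] = lowlink[root] = counter
        counter += 1
        stack.append(root)
        on_stack.add(root)
        # Explicit DFS stack of (node, iterator over its successors),
        # so deep graphs do not hit Python's recursion limit.
        work = [(root, iter(adjacency.get(root, ())))]
        while work:
            node, successors = work[-1]
            descended = False
            for nxt in successors:
                if nxt not in index:
                    index[nxt] = lowlink[nxt] = counter
                    counter += 1
                    stack.append(nxt)
                    on_stack.add(nxt)
                    work.append((nxt, iter(adjacency.get(nxt, ()))))
                    descended = True
                    break
                if nxt in on_stack:
                    lowlink[node] = min(lowlink[node], index[nxt])
            if descended:
                continue
            work.pop()
            if work:
                parent = work[-1][0]
                lowlink[parent] = min(lowlink[parent], lowlink[node])
            if lowlink[node] == index[node]:
                # node is the root of an SCC: pop its members off the stack.
                component = []
                while True:
                    member = stack.pop()
                    on_stack.discard(member)
                    component.append(member)
                    if member == node:
                        break
                if len(component) > 1:
                    in_cycle.update(component)
    return sorted(in_cycle)
```

The final `sorted` call assumes the vertex labels are mutually comparable; graphs that mix label types would need a different return convention.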


Step 2: Optimize Cycle Discovery

  • nx.simple_cycles(graph) is the most efficient cycle enumerator available in NetworkX (it is based on Johnson's algorithm); alternatives would add more code, though they can be faster for certain graph densities.
  • If you do not need to enumerate all cycles and only care about the nodes involved in any cycle, you could compute the strongly connected components (SCCs). Any SCC of size > 1, or with a self-loop, contains a cycle.

Step 3: Optimize Cycle Vertex Extraction

Instead of flattening all cycles, collect nodes in SCCs of size > 1 (and nodes with self-loops). This is faster than enumerating all cycles.


Fastest Solution

Let’s use strongly connected components. For each SCC:

  • If it contains more than one node, every node lies on at least one cycle.
  • If it contains a single node, check for a self-loop.
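
The two rules above can be sketched with NetworkX as follows. This is an illustrative reconstruction under stated assumptions, not necessarily the code in this PR: it assumes the function takes an edge list and returns a sorted list of vertices.

```python
import networkx as nx


def find_cycle_vertices(edges):
    """Return the sorted vertices that lie on at least one directed cycle."""
    graph = nx.DiGraph(edges)
    cycle_vertices = set()
    for component in nx.strongly_connected_components(graph):
        if len(component) > 1:
            # Every node of a multi-node SCC lies on some cycle.
            cycle_vertices.update(component)
        else:
            # A single-node SCC is cyclic only if it has a self-loop.
            (node,) = component
            if graph.has_edge(node, node):
                cycle_vertices.add(node)
    return sorted(cycle_vertices)
```

For example, `find_cycle_vertices([(1, 2), (2, 1), (3, 4)])` yields the two vertices of the 2-cycle, while 3 and 4 each form a single-node SCC without a self-loop and are skipped.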

Why is this faster?

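  • No cycle enumeration needed: Strongly connected components can be found in time linear in the number of nodes and edges, O(V + E), whereas the number of simple cycles can grow exponentially with graph size.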
  • No flattening of all cycles: With SCCs, we grab all nodes at once.
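
To give a sense of scale for the enumeration cost, the sketch below counts the simple cycles in a complete directed graph on n nodes (every ordered pair connected, no self-loops). The function name is hypothetical; the formula picks k nodes and arranges them in one of (k - 1)! cyclic orders.

```python
from math import comb, factorial


def simple_cycle_count_complete_digraph(n: int) -> int:
    """Number of simple cycles (length >= 2) in a complete digraph on n nodes."""
    # Choose k nodes, then arrange them in one of (k - 1)! cyclic orders.
    return sum(comb(n, k) * factorial(k - 1) for k in range(2, n + 1))


for n in (3, 5, 10):
    print(n, simple_cycle_count_complete_digraph(n))
```

A 10-node complete digraph already has more than a million simple cycles, yet it is a single strongly connected component, so the SCC approach touches each node and edge only a constant number of times.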

Compatibility

  • Works with all versions of NetworkX.
  • No additional dependencies.
  • Preserves function signature and result.

Summary of Optimizations

  1. Replaces slow call: nx.simple_cycles → much faster SCC analysis.
  2. Minimal code change: The function stays short and easy to maintain.
  3. No unnecessary flattening: Only collects nodes once.

If you are allowed to avoid NetworkX entirely, let me know for a native, even faster solution! This version, however, will give you a major speedup for graphs with cycles.


Full rewritten code.

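This should deliver a dramatic speed improvement over the original, especially for larger graphs! The generated tests below show one exception: the two large graphs made up only of self-loops ran 4.59% and 34.0% slower.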

Correctness verification report:

| Test | Status |
|---|---|
| ⚙️ Existing Unit Tests | 26 Passed |
| 🌀 Generated Regression Tests | 52 Passed |
| ⏪ Replay Tests | 🔘 None Found |
| 🔎 Concolic Coverage Tests | 🔘 None Found |
| 📊 Tests Coverage | 100.0% |

⚙️ Existing Unit Tests and Runtime

| Test File::Test Function | Original ⏱️ | Optimized ⏱️ | Speedup |
|---|---|---|---|
| test_dsa_nodes.py::test_complex_graph | 186μs | 43.0μs | ✅334% |
| test_dsa_nodes.py::test_cycle_with_extra_nodes_edges | 109μs | 36.5μs | ✅200% |
| test_dsa_nodes.py::test_figure_eight | 148μs | 28.3μs | ✅422% |
| test_dsa_nodes.py::test_multiple_disjoint_cycles | 119μs | 28.1μs | ✅324% |
| test_dsa_nodes.py::test_multiple_overlapping_cycles | 147μs | 28.0μs | ✅425% |
| test_dsa_nodes.py::test_no_cycles_dag | 39.2μs | 22.0μs | ✅78.8% |
| test_dsa_nodes.py::test_self_loop | 26.0μs | 17.3μs | ✅50.1% |
| test_dsa_nodes.py::test_simple_triangle_cycle | 80.8μs | 22.1μs | ✅265% |
| test_dsa_nodes.py::test_simple_two_node_cycle | 69.4μs | 20.1μs | ✅245% |
| test_dsa_nodes.py::test_string_vertices | 103μs | 34.0μs | ✅204% |

🌀 Generated Regression Tests and Runtime
import networkx as nx
# imports
import pytest  # used for our unit tests
from src.dsa.nodes import find_cycle_vertices

# unit tests

# ------------------------
# BASIC TEST CASES
# ------------------------

def test_no_edges():
    # No edges, no cycles
    codeflash_output = find_cycle_vertices([]) # 21.9μs -> 11.8μs (85.2% faster)

def test_no_cycles_linear_chain():
    # Linear chain: 1->2->3->4, no cycles
    edges = [(1, 2), (2, 3), (3, 4)]
    codeflash_output = find_cycle_vertices(edges) # 43.0μs -> 24.2μs (77.3% faster)

def test_single_self_loop():
    # Single node with a self-loop
    edges = [(1, 1)]
    codeflash_output = find_cycle_vertices(edges) # 24.6μs -> 16.6μs (48.2% faster)

def test_simple_cycle():
    # Simple 3-node cycle: 1->2->3->1
    edges = [(1, 2), (2, 3), (3, 1)]
    codeflash_output = find_cycle_vertices(edges) # 82.5μs -> 21.9μs (277% faster)

def test_two_disjoint_cycles():
    # Two disjoint cycles: 1->2->1 and 3->4->5->3
    edges = [(1, 2), (2, 1), (3, 4), (4, 5), (5, 3)]
    codeflash_output = find_cycle_vertices(edges) # 118μs -> 27.7μs (329% faster)

def test_cycle_with_tail():
    # 1->2->3->1 is a cycle, 4->2 is a tail
    edges = [(1, 2), (2, 3), (3, 1), (4, 2)]
    codeflash_output = find_cycle_vertices(edges) # 86.0μs -> 25.0μs (244% faster)

def test_cycle_with_branch():
    # 1->2->3->1 is a cycle, 2->4 is a branch
    edges = [(1, 2), (2, 3), (3, 1), (2, 4)]
    codeflash_output = find_cycle_vertices(edges) # 87.2μs -> 24.8μs (252% faster)

def test_multiple_overlapping_cycles():
    # 1->2->3->1 and 2->4->5->2
    edges = [(1, 2), (2, 3), (3, 1), (2, 4), (4, 5), (5, 2)]
    # All five vertices (1 through 5) are in at least one cycle
    codeflash_output = find_cycle_vertices(edges) # 149μs -> 28.1μs (433% faster)

def test_cycle_with_extra_edges():
    # 1->2->3->1 is a cycle, 2->4 and 4->5 (no cycle)
    edges = [(1, 2), (2, 3), (3, 1), (2, 4), (4, 5)]
    codeflash_output = find_cycle_vertices(edges) # 91.7μs -> 27.7μs (231% faster)

# ------------------------
# EDGE TEST CASES
# ------------------------

def test_single_node_no_edges():
    # An edge list cannot describe an isolated node, so this is the empty graph
    edges = []
    codeflash_output = find_cycle_vertices(edges) # 21.5μs -> 11.1μs (94.0% faster)

def test_single_node_self_loop():
    # One node with a self-loop
    edges = [(0, 0)]
    codeflash_output = find_cycle_vertices(edges) # 25.2μs -> 16.9μs (49.4% faster)

def test_disconnected_graph_with_and_without_cycles():
    # Two components: 1->2->1 (cycle), 3->4 (no cycle)
    edges = [(1, 2), (2, 1), (3, 4)]
    codeflash_output = find_cycle_vertices(edges) # 76.9μs -> 24.5μs (214% faster)

def test_duplicate_edges_in_cycle():
    # 1->2->3->1, but 1->2 appears twice
    edges = [(1, 2), (1, 2), (2, 3), (3, 1)]
    codeflash_output = find_cycle_vertices(edges) # 81.0μs -> 22.3μs (263% faster)

def test_multiple_self_loops():
    # 1->1, 2->2, 3->4->5->3 (cycle)
    edges = [(1, 1), (2, 2), (3, 4), (4, 5), (5, 3)]
    codeflash_output = find_cycle_vertices(edges) # 84.8μs -> 28.2μs (200% faster)

def test_cycle_with_noninteger_vertices():
    # Use string labels
    edges = [("a", "b"), ("b", "c"), ("c", "a"), ("d", "e")]
    codeflash_output = find_cycle_vertices(edges) # 94.2μs -> 28.7μs (229% faster)

def test_empty_graph():
    # No nodes, no edges
    codeflash_output = find_cycle_vertices([]) # 21.2μs -> 11.1μs (91.4% faster)

def test_cycle_with_negative_and_zero_vertices():
    # Vertices: -1->0->-1 (cycle), 1->2 (no cycle)
    edges = [(-1, 0), (0, -1), (1, 2)]
    codeflash_output = find_cycle_vertices(edges) # 76.2μs -> 25.2μs (202% faster)

def test_cycle_with_large_labels():
    # Large integer labels
    edges = [(1000000, 2000000), (2000000, 1000000)]
    codeflash_output = find_cycle_vertices(edges) # 73.8μs -> 22.0μs (235% faster)

def test_graph_with_no_cycles_but_many_edges():
    # Acyclic 10-edge chain: 0->1->...->10
    edges = [(i, i+1) for i in range(10)]
    codeflash_output = find_cycle_vertices(edges) # 73.7μs -> 42.9μs (71.7% faster)

# ------------------------
# LARGE SCALE TEST CASES
# ------------------------

def test_large_cycle():
    # Large single cycle: 0->1->2->...->999->0
    n = 1000
    edges = [(i, (i+1)%n) for i in range(n)]
    codeflash_output = find_cycle_vertices(edges) # 12.1ms -> 2.27ms (435% faster)

def test_large_acyclic_graph():
    # Large DAG: 0->1->2->...->999, no cycles
    n = 1000
    edges = [(i, i+1) for i in range(n-1)]
    codeflash_output = find_cycle_vertices(edges) # 3.89ms -> 2.27ms (71.3% faster)

def test_large_graph_with_multiple_cycles_and_branches():
    # Two large cycles and some branches
    n = 500
    # First cycle: 0->1->...->499->0
    cycle1 = [(i, (i+1)%n) for i in range(n)]
    # Second cycle: 500->501->...->999->500
    cycle2 = [(i, i+1) for i in range(n, 2*n-1)] + [(2*n-1, n)]
    # Branches: 250->750, 400->800 (cross edges)
    branches = [(250, 750), (400, 800)]
    edges = cycle1 + cycle2 + branches
    expected = list(range(0, n)) + list(range(n, 2*n))
    codeflash_output = find_cycle_vertices(edges) # 12.4ms -> 2.27ms (445% faster)

def test_large_sparse_graph_with_self_loops():
    # Only 10 nodes (0, 100, ..., 900), each with a self-loop
    n = 1000
    edges = [(i, i) for i in range(0, n, 100)]
    codeflash_output = find_cycle_vertices(edges) # 45.9μs -> 48.1μs (4.59% slower)

def test_large_graph_with_overlapping_cycles():
    # 0->1->2->...->499->0 (cycle)
    # 250->500->750->250 (overlapping cycle)
    n = 1000
    edges = [(i, (i+1)%500) for i in range(500)]  # first cycle
    edges += [(250, 500), (500, 750), (750, 250)]  # second cycle
    expected = list(range(500)) + [250, 500, 750]
    # Remove duplicates for expected
    expected = sorted(set(expected))
    codeflash_output = find_cycle_vertices(edges) # 6.20ms -> 1.15ms (437% faster)

# ------------------------
# ADDITIONAL EDGE CASES
# ------------------------

def test_cycle_with_tuple_labels():
    # Vertices are tuples
    edges = [((1,2), (2,3)), ((2,3), (3,1)), ((3,1), (1,2))]
    codeflash_output = find_cycle_vertices(edges) # 92.0μs -> 26.0μs (254% faster)


def test_cycle_with_floats():
    # Vertices are floats
    edges = [(1.1, 2.2), (2.2, 1.1)]
    codeflash_output = find_cycle_vertices(edges) # 77.5μs -> 23.5μs (231% faster)

def test_cycle_with_bool_labels():
    # Vertices are booleans
    edges = [(True, False), (False, True)]
    codeflash_output = find_cycle_vertices(edges) # 71.3μs -> 20.3μs (251% faster)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.

import random

# function to test
# derived from https://github.com/langflow-ai/langflow/pull/5262
import networkx as nx
# imports
import pytest  # used for our unit tests
from src.dsa.nodes import find_cycle_vertices

# unit tests

# ----------------------------
# 1. Basic Test Cases
# ----------------------------

def test_no_edges():
    # No edges, so no cycles
    codeflash_output = find_cycle_vertices([]) # 21.6μs -> 11.0μs (96.6% faster)

def test_single_self_loop():
    # One node with a self-loop forms a cycle
    codeflash_output = find_cycle_vertices([(1, 1)]) # 25.3μs -> 16.8μs (50.9% faster)

def test_two_node_cycle():
    # Two nodes forming a cycle: 1 -> 2 -> 1
    codeflash_output = find_cycle_vertices([(1, 2), (2, 1)]) # 67.8μs -> 19.4μs (249% faster)

def test_three_node_cycle():
    # Three nodes in a cycle: 1 -> 2 -> 3 -> 1
    codeflash_output = find_cycle_vertices([(1, 2), (2, 3), (3, 1)]) # 80.0μs -> 21.7μs (269% faster)

def test_disconnected_graph_with_cycle():
    # Disconnected graph: one component with a cycle, one without
    edges = [(1, 2), (2, 3), (3, 1), (4, 5)]
    codeflash_output = find_cycle_vertices(edges) # 88.8μs -> 26.8μs (232% faster)

def test_disconnected_graph_no_cycle():
    # Disconnected graph with no cycles
    edges = [(1, 2), (3, 4)]
    codeflash_output = find_cycle_vertices(edges) # 40.2μs -> 22.5μs (78.4% faster)

def test_multiple_cycles():
    # Two cycles: 1->2->3->1 and 4->5->4
    edges = [(1, 2), (2, 3), (3, 1), (4, 5), (5, 4)]
    codeflash_output = find_cycle_vertices(edges) # 117μs -> 27.5μs (325% faster)

def test_cycle_and_tail():
    # 1->2->3->1 is a cycle, 4->1 is a tail into the cycle
    edges = [(1, 2), (2, 3), (3, 1), (4, 1)]
    codeflash_output = find_cycle_vertices(edges) # 84.9μs -> 24.5μs (247% faster)

def test_multiple_cycles_with_shared_vertex():
    # 1->2->3->1 and 3->4->5->3
    edges = [(1, 2), (2, 3), (3, 1), (3, 4), (4, 5), (5, 3)]
    codeflash_output = find_cycle_vertices(edges) # 146μs -> 27.8μs (429% faster)

# ----------------------------
# 2. Edge Test Cases
# ----------------------------

def test_empty_graph():
    # No nodes or edges
    codeflash_output = find_cycle_vertices([]) # 21.3μs -> 11.0μs (93.2% faster)

def test_single_node_no_self_loop():
    # Two nodes joined by a single edge: no self-loop and no cycle
    codeflash_output = find_cycle_vertices([(1, 2)]) # 32.8μs -> 18.4μs (78.1% faster)

def test_multiple_self_loops():
    # Multiple nodes with self-loops
    edges = [(1, 1), (2, 2), (3, 3)]
    codeflash_output = find_cycle_vertices(edges) # 29.0μs -> 22.5μs (28.6% faster)

def test_parallel_edges():
    # Parallel edges between the same nodes, forming a 2-node cycle
    edges = [(1, 2), (1, 2), (2, 1)]
    codeflash_output = find_cycle_vertices(edges) # 68.1μs -> 19.9μs (242% faster)

def test_large_cycle_with_tail():
    # 1->2->3->4->5->1 is a cycle, 6->1 is a tail
    edges = [(1, 2), (2, 3), (3, 4), (4, 5), (5, 1), (6, 1)]
    codeflash_output = find_cycle_vertices(edges) # 112μs -> 30.0μs (274% faster)

def test_cycle_with_isolated_node():
    # 1->2->3->1 is a cycle; an isolated node 4 cannot be expressed in an edge list, so it is absent
    edges = [(1, 2), (2, 3), (3, 1)]
    codeflash_output = find_cycle_vertices(edges) # 80.8μs -> 21.8μs (271% faster)

def test_cycle_with_duplicate_edges():
    # Duplicate edges in the cycle
    edges = [(1, 2), (2, 3), (3, 1), (1, 2)]
    codeflash_output = find_cycle_vertices(edges) # 80.8μs -> 22.3μs (262% faster)

def test_graph_with_no_cycles():
    # Directed acyclic graph (DAG)
    edges = [(1, 2), (2, 3), (3, 4)]
    codeflash_output = find_cycle_vertices(edges) # 41.9μs -> 23.2μs (80.6% faster)

def test_cycle_with_non_integer_nodes():
    # Nodes are strings
    edges = [("a", "b"), ("b", "c"), ("c", "a")]
    codeflash_output = find_cycle_vertices(edges) # 84.8μs -> 23.4μs (263% faster)



def test_large_single_cycle():
    # Large cycle of 1000 nodes
    N = 1000
    edges = [(i, i+1) for i in range(N-1)] + [(N-1, 0)]
    codeflash_output = find_cycle_vertices(edges) # 12.1ms -> 2.26ms (434% faster)

def test_large_acyclic_graph():
    # Large DAG: 1000 nodes in a line
    N = 1000
    edges = [(i, i+1) for i in range(N-1)]
    codeflash_output = find_cycle_vertices(edges) # 3.91ms -> 2.28ms (71.2% faster)

def test_large_graph_with_multiple_cycles():
    # Two large cycles, each of 500 nodes, disjoint
    N = 500
    edges = (
        [(i, i+1) for i in range(N-1)] + [(N-1, 0)] +  # first cycle
        [(N+i, N+i+1) for i in range(N-1)] + [(2*N-1, N)]  # second cycle
    )
    codeflash_output = find_cycle_vertices(edges) # 12.3ms -> 2.27ms (442% faster)

def test_large_sparse_graph_with_few_cycles():
    # 1000 nodes, mostly sparse, with a small cycle
    N = 1000
    edges = [(i, i+1) for i in range(N-1)]  # chain
    # Add a small cycle at the end
    edges += [(997, 998), (998, 999), (999, 997)]
    expected = [997, 998, 999]
    codeflash_output = find_cycle_vertices(edges) # 3.95ms -> 2.28ms (73.4% faster)

def test_large_graph_with_self_loops():
    # 1000 nodes, each with a self-loop
    N = 1000
    edges = [(i, i) for i in range(N)]
    codeflash_output = find_cycle_vertices(edges) # 1.67ms -> 2.52ms (34.0% slower)

def test_large_graph_random_edges_no_cycles():
    # 1000-node chain (a DAG) whose edges are inserted in random order
    N = 1000
    edges = [(i, i+1) for i in range(N-1)]
    # Shuffling changes only the insertion order; the graph stays acyclic
    random.shuffle(edges)
    codeflash_output = find_cycle_vertices(edges) # 4.22ms -> 2.47ms (70.7% faster)

def test_large_graph_random_edges_with_cycles():
    # 1000-node chain, plus a cycle among the last 10 nodes
    N = 1000
    edges = [(i, i+1) for i in range(N-1)]
    # Add a cycle among the last 10 nodes
    cycle_nodes = list(range(N-10, N))
    for i in range(10):
        edges.append((cycle_nodes[i], cycle_nodes[(i+1)%10]))
    expected = list(range(N-10, N))
    codeflash_output = find_cycle_vertices(edges) # 4.01ms -> 2.28ms (76.0% faster)
    result = codeflash_output
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
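
The same idea of comparing outputs can be sketched independently of Codeflash by checking a candidate implementation against a brute-force reference on random graphs. Everything below is illustrative: the helper names, trial count, and graph sizes are assumptions, and this is not how Codeflash itself performs its verification.

```python
import random


def cycle_vertices_bruteforce(edges):
    """Reference answer: a vertex is on a cycle iff it can reach itself."""
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set())
    result = []
    for start in adjacency:
        seen = set()
        frontier = list(adjacency[start])
        while frontier:
            node = frontier.pop()
            if node in seen:
                continue
            seen.add(node)
            frontier.extend(adjacency[node])
        if start in seen:
            result.append(start)
    return sorted(result)


def agrees_with_reference(candidate, trials=200, max_nodes=8, seed=0):
    """Compare candidate(edges) with the brute-force answer on random graphs."""
    rng = random.Random(seed)
    for _ in range(trials):
        n = rng.randint(1, max_nodes)
        edge_count = rng.randint(0, 2 * n)
        edges = [(rng.randrange(n), rng.randrange(n)) for _ in range(edge_count)]
        if sorted(candidate(edges)) != cycle_vertices_bruteforce(edges):
            return False
    return True
```

Calling `agrees_with_reference(find_cycle_vertices)` would then exercise the function under test on small random graphs, including self-loops and duplicate edges.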

To edit these changes, run git checkout codeflash/optimize-find_cycle_vertices-mc8pivnf and push.

@codeflash-ai codeflash-ai bot added the ⚡️ codeflash Optimization PR opened by Codeflash AI label Jun 23, 2025
@codeflash-ai codeflash-ai bot requested a review from KRRT7 June 23, 2025 06:20
@KRRT7 KRRT7 closed this Jun 23, 2025
@codeflash-ai codeflash-ai bot deleted the codeflash/optimize-find_cycle_vertices-mc8pivnf branch June 23, 2025 23:31