⚡️ Speed up method CSVSink.parse_detection_data by 6% #55


Open · wants to merge 1 commit into base: develop

Conversation

@codeflash-ai codeflash-ai bot commented Feb 3, 2025

📄 6% (0.06x) speedup for CSVSink.parse_detection_data in supervision/detection/tools/csv_sink.py

⏱️ Runtime : 366 microseconds → 345 microseconds (best of 74 runs)

📝 Explanation and details

o3-mini
We optimized the function by caching repeated attribute lookups (e.g. `detections.xyxy` and `detections.data`) in local variables, precomputing a boolean flag instead of checking for the `data` attribute on every iteration, and preallocating the result list. The resulting code preserves the original behavior while reducing the overhead of repeated attribute accesses.
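
For illustration, here is a minimal sketch of that optimized shape, written against the mock `Detections` class used in the regression tests below; the exact code in the PR diff may differ in details such as string conversion of IDs or how scalar entries in `detections.data` are handled:

```python
from typing import Any, Dict, List, Optional


# Illustrative sketch of the described optimization -- not the exact PR diff.
def parse_detection_data(
    detections, custom_data: Optional[Dict[str, Any]] = None
) -> List[Dict[str, Any]]:
    # Cache attribute lookups in locals once instead of re-resolving
    # detections.<attr> on every loop iteration.
    xyxy = detections.xyxy
    class_id = detections.class_id
    confidence = detections.confidence
    tracker_id = detections.tracker_id

    # Precompute the "data" check so hasattr() is not evaluated per row.
    has_data = hasattr(detections, "data") and detections.data is not None
    data = detections.data if has_data else {}

    # Preallocate the result list and fill it by index.
    n = len(xyxy)
    parsed_rows = [None] * n
    for i in range(n):
        box = xyxy[i]
        row = {
            "x_min": box[0],
            "y_min": box[1],
            "x_max": box[2],
            "y_max": box[3],
            "class_id": "" if class_id is None else class_id[i],
            "confidence": "" if confidence is None else confidence[i],
            "tracker_id": "" if tracker_id is None else tracker_id[i],
        }
        if has_data:
            for key, value in data.items():
                row[key] = value[i]
        if custom_data:
            row.update(custom_data)
        parsed_rows[i] = row
    return parsed_rows
```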

Correctness verification report:

| Test | Status |
| --- | --- |
| ⚙️ Existing Unit Tests | 🔘 None Found |
| 🌀 Generated Regression Tests | 6 Passed |
| ⏪ Replay Tests | 🔘 None Found |
| 🔎 Concolic Coverage Tests | 🔘 None Found |
| 📊 Tests Coverage | 83.3% |
🌀 Generated Regression Tests Details
from __future__ import annotations

import csv
from typing import Any, Dict, List, Optional

# imports
import pytest  # used for our unit tests
from supervision.detection.tools.csv_sink import CSVSink


# Mock Detections class for testing
class Detections:
    def __init__(self, xyxy, class_id=None, confidence=None, tracker_id=None, data=None):
        self.xyxy = xyxy
        self.class_id = class_id
        self.confidence = confidence
        self.tracker_id = tracker_id
        self.data = data

# unit tests

# Basic Functionality

def test_empty_detections():
    detections = Detections(
        xyxy=[],
        class_id=[],
        confidence=[],
        tracker_id=[]
    )
    expected = []
    codeflash_output = CSVSink.parse_detection_data(detections)
    assert codeflash_output == expected

# Large Scale Test Cases


def test_invalid_xyxy_format():
    detections = Detections(
        xyxy=[10, 20, 30, 40],  # Incorrect format
        class_id=[1],
        confidence=[0.9],
        tracker_id=[101]
    )
    with pytest.raises(TypeError):
        CSVSink.parse_detection_data(detections)

# Complex Data Structures



# The tests below appear to form a second, self-contained generated test module
# (they re-import CSVSink and re-declare the mock Detections class).
from __future__ import annotations

import csv
from typing import Any, Dict, List, Optional

import numpy as np
# imports
import pytest  # used for our unit tests
from supervision.detection.tools.csv_sink import CSVSink


# Mock Detections class for testing
class Detections:
    def __init__(self, xyxy, class_id=None, confidence=None, tracker_id=None, data=None):
        self.xyxy = xyxy
        self.class_id = class_id
        self.confidence = confidence
        self.tracker_id = tracker_id
        self.data = data

# unit tests



def test_empty_detections():
    detections = Detections(xyxy=[])
    expected_output = []
    codeflash_output = CSVSink.parse_detection_data(detections)
    assert codeflash_output == expected_output



def test_array_additional_data():
    detections = Detections(xyxy=[[10, 20, 30, 40]], data={"score": np.array([0.95])})
    expected_output = [{"x_min": 10, "y_min": 20, "x_max": 30, "y_max": 40, "class_id": "", "confidence": "", "tracker_id": "", "score": 0.95}]
    codeflash_output = CSVSink.parse_detection_data(detections)
    assert codeflash_output == expected_output



def test_large_additional_data():
    detections = Detections(xyxy=[[i, i+1, i+2, i+3] for i in range(1000)], data={"score": np.array([0.95]*1000)})
    expected_output = [{"x_min": i, "y_min": i+1, "x_max": i+2, "y_max": i+3, "class_id": "", "confidence": "", "tracker_id": "", "score": 0.95} for i in range(1000)]
    codeflash_output = CSVSink.parse_detection_data(detections)
    assert codeflash_output == expected_output
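
For context, these rows are normally produced indirectly through `CSVSink.append` rather than by calling the static method directly. A short usage sketch follows, assuming the context-manager and `append(detections, custom_data)` API shown in the supervision docs; the file name and values are made up:

```python
import numpy as np
import supervision as sv

# Build a small Detections object by hand (values are illustrative).
detections = sv.Detections(
    xyxy=np.array([[10.0, 20.0, 30.0, 40.0]]),
    class_id=np.array([1]),
    confidence=np.array([0.9]),
    tracker_id=np.array([101]),
)

# CSVSink writes one CSV row per detection; custom_data adds extra columns.
with sv.CSVSink("detections.csv") as sink:
    sink.append(detections, custom_data={"frame_index": 0})
```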

Codeflash

@codeflash-ai codeflash-ai bot added the ⚡️ codeflash Optimization PR opened by Codeflash AI label Feb 3, 2025
@codeflash-ai codeflash-ai bot requested a review from misrasaurabh1 February 3, 2025 07:30