Implement reusable dataset downloader #1002

Open
wants to merge 4 commits into
base: main
from

Conversation

kurotych
Member

@kurotych kurotych commented Jun 3, 2025

@kurotych kurotych requested a review from Copilot June 3, 2025 14:17
@Copilot Copilot AI left a comment

Pull Request Overview

This pull request implements a reusable dataset downloader and integrates it into various parts of the mobile_verifier project. Key changes include updating database transaction handling in dataset assignment, refactoring integration tests to use TaskManager with the new daemon, and introducing the new dataset_downloader package with corresponding dependency updates.

Reviewed Changes

Copilot reviewed 15 out of 15 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| mobile_verifier/tests/integrations/common/mod.rs | Uses a transaction for dataset assignment save operations |
| mobile_verifier/tests/integrations/boosting_oracles.rs | Refactors the dataset downloader integration with TaskManager and updates poll duration |
| mobile_verifier/src/rewarder.rs | Adjusts the import to use the dataset_downloader's check_for_unprocessed_data_sets |
| mobile_verifier/Cargo.toml | Adds a new dependency for dataset_downloader |
| dataset_downloader/tests/downloader_test.rs | Introduces tests covering new dataset downloader functionalities |
| dataset_downloader/Cargo.toml | New package manifest for the dataset_downloader |
| Cargo.toml | Updates workspace members and adds dependencies for regex and async-compression |

Comments suppressed due to low confidence (2)

mobile_verifier/tests/integrations/boosting_oracles.rs:147

  • [nitpick] Consider using a more descriptive variable name instead of 'dsdd' to improve code clarity.
let dsdd = DataSetDownloaderDaemon::new(

mobile_verifier/tests/integrations/boosting_oracles.rs:134

  • Verify that a 25ms polling duration is sufficient under expected load and does not introduce flakiness in asynchronous tests.
let poll_duration = std::time::Duration::from_millis(25);
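
One way to reduce the flakiness risk raised above is to pair the short poll interval with an explicit deadline, so that a test waits only as long as it needs to but fails deterministically if the condition never holds. The sketch below is a minimal, synchronous, std-only illustration. `wait_until` and its parameters are hypothetical names, not part of this PR, and the actual tests are asynchronous, so an async equivalent would sleep on the runtime's timer rather than `thread::sleep`.

```rust
use std::thread;
use std::time::{Duration, Instant};

/// Hypothetical helper: polls `check` every `poll_interval` until it
/// returns true or `timeout` elapses. Returns true if the condition
/// was met before the deadline, false otherwise.
fn wait_until<F>(poll_interval: Duration, timeout: Duration, mut check: F) -> bool
where
    F: FnMut() -> bool,
{
    let deadline = Instant::now() + timeout;
    loop {
        // Check first so an already-satisfied condition returns immediately.
        if check() {
            return true;
        }
        // Give up once the deadline has passed.
        if Instant::now() >= deadline {
            return false;
        }
        thread::sleep(poll_interval);
    }
}
```

A test could then assert on something like `wait_until(Duration::from_millis(25), Duration::from_secs(5), || condition_holds())`, where `condition_holds` stands in for whatever the test observes. The 25ms interval keeps the test fast, while the longer timeout absorbs scheduling jitter under load.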

@@ -118,6 +118,9 @@ aws-config = "0.51"
aws-sdk-s3 = "0.21"
aws-types = { version = "0.51", features = ["hardcoded-credentials"]}
tempfile = "3"
regex = "1"
async-compression = { version = "0", features = ["tokio", "gzip"] }
Copilot AI Jun 3, 2025

[nitpick] Consider specifying a more explicit version for async-compression to prevent potential dependency resolution issues.

Suggested change
- async-compression = { version = "0", features = ["tokio", "gzip"] }
+ async-compression = { version = "0.3.8", features = ["tokio", "gzip"] }
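
For context on this suggestion: Cargo treats a bare version string as a caret requirement, so `"0"` admits any release from 0.0.0 up to but not including 1.0.0, while `"0.3.8"` admits 0.3.8 up to but not including 0.4.0. The sketch below illustrates those bounds. `parse_partial`, `caret_range`, and `satisfies` are hypothetical helpers written for this note, not Cargo's implementation, and handling of pre-release and build metadata is deliberately left out.

```rust
type Version = (u64, u64, u64);

/// Parses "MAJOR[.MINOR[.PATCH]]"; omitted components come back as None.
fn parse_partial(s: &str) -> Option<(u64, Option<u64>, Option<u64>)> {
    let mut parts = s.split('.');
    let major = parts.next()?.parse::<u64>().ok()?;
    let minor = match parts.next() {
        Some(p) => Some(p.parse::<u64>().ok()?),
        None => None,
    };
    let patch = match parts.next() {
        Some(p) => Some(p.parse::<u64>().ok()?),
        None => None,
    };
    // Reject anything with more than three components.
    if parts.next().is_some() {
        return None;
    }
    Some((major, minor, patch))
}

/// Returns the half-open range [lower, upper) allowed by a bare
/// (caret) requirement: the leftmost non-zero component may not change.
fn caret_range(req: &str) -> Option<(Version, Version)> {
    let (major, minor, patch) = parse_partial(req)?;
    let lower = (major, minor.unwrap_or(0), patch.unwrap_or(0));
    let upper = match (major, minor, patch) {
        (0, None, _) => (1, 0, 0),              // "0"     -> < 1.0.0
        (0, Some(0), None) => (0, 1, 0),        // "0.0"   -> < 0.1.0
        (0, Some(0), Some(p)) => (0, 0, p + 1), // "0.0.p" -> < 0.0.(p+1)
        (0, Some(m), _) => (0, m + 1, 0),       // "0.m.x" -> < 0.(m+1).0
        (m, _, _) => (m + 1, 0, 0),             // "m.x.y" -> < (m+1).0.0
    };
    Some((lower, upper))
}

/// True if `version` falls inside the range implied by `req`.
fn satisfies(req: &str, version: Version) -> bool {
    match caret_range(req) {
        Some((lower, upper)) => lower <= version && version < upper,
        None => false,
    }
}
```

Under these rules a hypothetical 0.4.1 release would satisfy `"0"` but not `"0.3.8"`, which is the practical difference between the two lines in the suggestion.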

@kurotych kurotych marked this pull request as ready for review June 16, 2025 13:16