support code generation tasks. #497
base: eval_qa
Dsantra92
left a comment
Minor changes requested
benchmark/download.py
Outdated
commit = r["commit_id"]
prob = r["problem_id"]
mapping.setdefault(repo, []).append((commit, prob))
if task == "codegen":
Instead of branching on `task`, can you just check whether it's a URL or a repo slug?
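A minimal sketch of what such a check could look like, assuming `repo` is a string holding either a full clone URL or an `owner/name` slug. The helper names `is_repo_url` and `is_repo_slug` and the slug pattern are hypothetical and not taken from this PR.

```python
import re
from urllib.parse import urlparse

# Hypothetical pattern for an "owner/name" slug; an assumption, not from the PR.
_SLUG_RE = re.compile(r"^[A-Za-z0-9_.-]+/[A-Za-z0-9_.-]+$")


def is_repo_url(repo: str) -> bool:
    """Return True if `repo` looks like a clone URL rather than a slug."""
    # SSH-style remotes such as git@github.com:owner/name.git have no scheme.
    if repo.startswith("git@"):
        return True
    parsed = urlparse(repo)
    return parsed.scheme in {"http", "https", "ssh", "git"} and bool(parsed.netloc)


def is_repo_slug(repo: str) -> bool:
    """Return True if `repo` looks like an "owner/name" slug."""
    return not is_repo_url(repo) and bool(_SLUG_RE.match(repo))
```

With helpers like these, the branch in `download.py` could depend on the shape of `repo` itself, so callers would not need to pass `task` just to select the input format.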
benchmark/download.py
Outdated
- base_dir/worktrees/<repo_name>/<worktree_name>
batch_no: number of batch copies to create per (commit,problem)
max_workers: number of threads for parallel repo processing
task: "qa" for legacy format, "codegen" for SWE-bench format
"qa" is not a legacy format.
benchmark/metrics.py
Outdated
Do NOT include any text outside the JSON object.
"""

correctness = GEval(
Modify it to be QA correctness?
No description provided.