Pull requests: eval-protocol/python-sdk

Pull requests list

lilac adapter
#389 opened Dec 29, 2025 by shreymodi1
enforce single evaluator upload per command
#387 opened Dec 24, 2025 by benjibc
updated tests
#382 opened Dec 18, 2025 by shreymodi1
Trail proxy setup
#380 opened Dec 17, 2025 by xiaoyifan
support extra headers
#373 opened Dec 15, 2025 by benjibc
warn if large datasets + force 1 run
#365 opened Dec 12, 2025 by xzrderek
Shrey/modelquality
#353 opened Dec 2, 2025 by shreymodi1
18 tasks
support for tokenids logprobs
#350 opened Nov 26, 2025 by shreymodi1
18 tasks
calibration evaluator
#345 opened Nov 24, 2025 by benjibc Draft
18 tasks
adding response quality validation for retry
#344 opened Nov 24, 2025 by morgendave
10 tasks
tests fix
#341 opened Nov 21, 2025 by shreymodi1
18 tasks
Shrey/trl
#335 opened Nov 17, 2025 by shreymodi1
18 tasks
Update Klavis MCP use case
#330 opened Nov 14, 2025 by LLiuZheng
Text to SQL RFT example
#324 opened Nov 10, 2025 by benjibc
swe-bench
#280 opened Oct 15, 2025 by shreymodi1
reasoning effort string change
#267 opened Oct 10, 2025 by shreymodi1
18 tasks
reuse pydantic example for local model picking
#251 opened Oct 5, 2025 by benjibc
pyyaml removal step 1
#247 opened Oct 3, 2025 by benjibc
directly hit enter to select
#245 opened Oct 2, 2025 by benjibc
auto convert from dict
#239 opened Sep 30, 2025 by mayinghan
18 tasks
Route benchmark datasets through data loaders codex
#229 opened Sep 27, 2025 by benjibc