Record timecost and add tqdm progress bar for rollout scheduler #378
Changes from all commits
25b5ec5
fb9c58c
9ceb154
01c0c1e
70b2e17
c7e3abb
2aecc31
e7d638d
| Diff line change |
|---|
| `@@ -809,9 +809,21 @@ class ExecutionMetadata(BaseModel):` |
| `  cost_metrics: Optional[CostMetrics] = Field(default=None, description="Cost breakdown for LLM API calls.")` |
| `+ # deprecated: use rollout_duration_seconds and eval_duration_seconds instead` |
| `  duration_seconds: Optional[float] = Field(` |
| `      default=None,` |
| `-     description="Processing duration in seconds for this evaluation row. Note that if it gets retried, this will be the duration of the last attempt.",` |
| `+     deprecated=True,` |
| `+     description="[Deprecated] Processing duration in seconds for this evaluation row. Note that if it gets retried, this will be the duration of the last attempt.",` |
| `  )` |
| `+ rollout_duration_seconds: Optional[float] = Field(` |
Collaborator
Regarding retries, I think it would still be valuable to track total_duration_seconds so that people can get a sense of the wall-clock time for this row. This can be helpful in the UI as well.
Collaborator
That can be follow-up PR work, though.
Collaborator (Author)
Makes sense. I think we should track the number of retries as well. For duration, we should probably still count only the last successful run. For failures, I think the failure reason matters more.
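To make the follow-up idea in this thread concrete, here is a minimal sketch of recording the last-attempt duration, the total wall-clock time across attempts, the retry count, and the failure reason. `AttemptTiming` and `run_with_retries` are hypothetical names used only for illustration; they are not part of this PR or of the project's API.

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class AttemptTiming:
    """Hypothetical container; these are not fields defined in the PR."""

    last_attempt_seconds: Optional[float] = None  # duration of the final attempt only
    total_seconds: float = 0.0  # time summed over every attempt
    num_retries: int = 0  # attempts made after the first one
    failure_reason: Optional[str] = None  # set only if the final attempt failed


def run_with_retries(fn: Callable[[], object], max_attempts: int = 3) -> AttemptTiming:
    """Call fn until it succeeds or max_attempts is reached, timing each attempt."""
    timing = AttemptTiming()
    for attempt in range(max_attempts):
        start = time.perf_counter()
        try:
            fn()
            timing.failure_reason = None
            succeeded = True
        except Exception as exc:
            timing.failure_reason = repr(exc)
            succeeded = False
        elapsed = time.perf_counter() - start
        timing.last_attempt_seconds = elapsed
        timing.total_seconds += elapsed
        timing.num_retries = attempt
        if succeeded:
            break
    return timing
```

In this sketch the total only sums time spent inside attempts; any backoff sleep between retries would have to be added separately if true wall-clock time is what the UI should show.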
| Diff line change |
|---|
| `+     default=None,` |
| `+     description="Processing duration in seconds for the rollout of this evaluation row. Note that if it gets retried, this will be the duration of the last attempt.",` |
| `+ )` |
| `+ eval_duration_seconds: Optional[float] = Field(` |
| `+     default=None,` |
| `+     description="Processing duration in seconds for the evaluation of this evaluation row. Note that if it gets retried, this will be the duration of the last attempt.",` |
| `+ )` |
| `  experiment_duration_seconds: Optional[float] = Field(` |
See https://docs.pydantic.dev/latest/concepts/fields/#deprecated-fields
gotcha, thanks!
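For readers following the split into rollout_duration_seconds and eval_duration_seconds, here is one way the two phases might be timed separately. This is a sketch under assumptions: the `timed` helper and the plain dictionary are hypothetical, and the PR's actual scheduler code is not shown on this page.

```python
import time
from contextlib import contextmanager
from typing import Dict, Iterator


@contextmanager
def timed(durations: Dict[str, float], key: str) -> Iterator[None]:
    """Store the elapsed seconds of the with-block under `key`, even if it raises."""
    start = time.perf_counter()
    try:
        yield
    finally:
        durations[key] = time.perf_counter() - start


durations: Dict[str, float] = {}

with timed(durations, "rollout_duration_seconds"):
    time.sleep(0.01)  # stand-in for the rollout step

with timed(durations, "eval_duration_seconds"):
    time.sleep(0.01)  # stand-in for the evaluation step
```

If a phase is retried, re-entering the block overwrites the earlier value, which matches the "duration of the last attempt" wording in the field descriptions.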