-
Notifications
You must be signed in to change notification settings - Fork 10
lilac adapter #389
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
lilac adapter #389
Conversation
| return [Message.model_validate(m) for m in messages_data] | ||
| except (json.JSONDecodeError, ValueError) as e: | ||
| logger.warning(f"Failed to deserialize messages: {e}") | ||
| return [] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Uncaught TypeError when messages JSON is non-list
In _deserialize_messages, if messages_json contains valid JSON that is not a list (e.g., a number like "123" or "null"), json.loads succeeds but iterating over the result with for m in messages_data raises a TypeError. This exception is not caught since only json.JSONDecodeError and ValueError are handled, causing the function to propagate an error instead of gracefully returning an empty list like it does for other parse failures.
| try: | ||
| input_metadata = InputMetadata.model_validate_json(data["input_metadata_json"]) | ||
| except (json.JSONDecodeError, ValueError): | ||
| input_metadata = InputMetadata(row_id=data.get("row_id")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pydantic ValidationError not caught, fallback logic unreachable
The except clauses catch (json.JSONDecodeError, ValueError) for Pydantic's model_validate_json() calls, but in Pydantic v2, this method raises pydantic.ValidationError, which is not a subclass of either exception. This means the fallback logic (like InputMetadata(row_id=data.get("row_id"))) is unreachable dead code. Any validation error causes the entire row to be skipped instead of gracefully recovering. Other parts of the codebase correctly catch ValidationError explicitly (e.g., in config.py).
Additional Locations (2)
|
this PR isn't specific to lilac in any way—its an adapter for pandas dataframe. Can we rename the module to reflect that? |
| def _evaluation_row_to_dict(row: EvaluationRow) -> dict[str, Any]: | ||
| """Convert a single EvaluationRow to a dictionary. | ||
|
|
||
| The output contains JSON-serialized fields that can be reconstructed back | ||
| to EvaluationRow. Users can add their own text columns for clustering. | ||
| """ | ||
| return { | ||
| # Identifiers | ||
| "row_id": row.input_metadata.row_id if row.input_metadata else None, | ||
| # Full data as JSON (for reconstruction) | ||
| "messages_json": json.dumps([_serialize_message(m) for m in row.messages]), | ||
| "tools_json": json.dumps(row.tools) if row.tools else None, | ||
| "ground_truth_json": json.dumps(row.ground_truth) if row.ground_truth else None, | ||
| "input_metadata_json": row.input_metadata.model_dump_json() if row.input_metadata else None, | ||
| "execution_metadata_json": row.execution_metadata.model_dump_json() if row.execution_metadata else None, | ||
| "evaluation_result_json": row.evaluation_result.model_dump_json() if row.evaluation_result else None, | ||
| # Scalar fields for filtering | ||
| "score": row.evaluation_result.score if row.evaluation_result else None, | ||
| "message_count": len(row.messages), | ||
| "has_tools": bool(row.tools), | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should this helper just live on the EvaluationRow for discovery? Also, how was the final set of keys decided? Why don't we just generically serialize the entire EvaluationRow?
Note
Adds a new Lilac integration adapter enabling round-trip conversion between Eval Protocol rows and DataFrames.
eval_protocol/adapters/lilac.pywithevaluation_rows_to_dataframeanddataframe_to_evaluation_rowsmessages_json,tools_json,ground_truth_json,input_metadata_json,execution_metadata_json,evaluation_result_json) plus scalar columns likescore,message_count,has_toolsWritten by Cursor Bugbot for commit 74cca5d. This will update automatically on new commits. Configure here.