Closed
Labels
eval [Component] This issue is related to evaluation
Description
Opening a new issue related to #1031.
I followed the setup steps, but I’m still encountering an issue when using the following test_config.json:
{
  "criteria": {
    "tool_trajectory_avg_score": 0.9,
    "response_match_score": 0.9
  }
}

When I run the evaluation using adk eval, I see that it successfully makes the Vertex Gen AI API call, but it fails with the following error:
AttributeError: 'float' object has no attribute 'item'
adk eval --config_file_path tests/evaluation/test_config.json sub_agents/analyze tests/evaluation/analyze_agent.test.json
Using evaluation criteria: {'tool_trajectory_avg_score': 0.9, 'response_match_score': 0.9}
Running Eval: evalseta62195:case1e7104
Code search tool called with query: Check why nexus configuration might not apply
Computing metrics with a total of 1 Vertex Gen AI Evaluation Service API requests.
100%|█████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 4.78it/s]
1 errors encountered during evaluation. Continue to compute summary metrics for the rest of the dataset.
Error encountered for metric rouge_1 at dataset index 0: Error: 500 Internal error encountered.
Evaluation Took:0.2025716999778524 seconds
Eval failed for `evalseta62195:case1e7104`
Traceback (most recent call last):
File "C:\Users\Administrator\Documents\triage-ai-project\.env\Lib\site-packages\google\adk\cli\cli_eval.py", ine 196, in run_evals
evaluation_result = metric_evaluator.evaluate_invocations(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\Administrator\Documents\triage-ai-project\.env\Lib\site-ackages\google\adk\evaluation\response_evaluator.py", line 76, in evaluate_invocations
score = self._get_score(eval_case_result)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\Administrator\Documents\triage-ai-project\.env\Lib\site-ackages\google\adk\evaluation\response_evaluator.py", line 120, in _get_score
return eval_result.summary_metrics[f"{self._metric_name}/mean"].item()
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: 'float' object has no attribute 'item'
Running Eval: evalseta62195:case940152
Code search tool called with query: Check why nexus configuration might not apply
Computing metrics with a total of 1 Vertex Gen AI Evaluation Service API requests.
100%|█████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 4.80it/s]
1 errors encountered during evaluation. Continue to compute summary metrics for the rest of the dataset.
Error encountered for metric rouge_1 at dataset index 0: Error: 500 Internal error encountered.
Evaluation Took:0.21564670000225306 seconds
Eval failed for `evalseta62195:case940152`
Traceback (most recent call last):
File "C:\Users\Administrator\Documents\triage-ai-project\.env\Lib\site-packages\google\adk\cli\cli_eval.py", ine 196, in run_evals
evaluation_result = metric_evaluator.evaluate_invocations(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\Administrator\Documents\triage-ai-project\.env\Lib\site-ackages\google\adk\evaluation\response_evaluator.py", line 76, in evaluate_invocations
score = self._get_score(eval_case_result)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\Administrator\Documents\triage-ai-project\.env\Lib\site-ackages\google\adk\evaluation\response_evaluator.py", line 120, in _get_score
return eval_result.summary_metrics[f"{self._metric_name}/mean"].item()
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: 'float' object has no attribute 'item'
*********************************************************************
Eval Run Summary
It appears the error is triggered here:
return eval_result.summary_metrics[f"{self._metric_name}/mean"].item()

If I remove response_match_score from the config and use only:
{
  "criteria": {
    "tool_trajectory_avg_score": 0.9
  }
}

…the evaluation completes without raising an exception, though the test cases still fail (as expected, due to a tool trajectory mismatch).
adk eval --config_file_path tests/evaluation/test_config.json sub_agents/analyze tests/evaluation/analyze_agent.test.json
Using evaluation criteria: {'tool_trajectory_avg_score': 0.9}
Running Eval: evalseta62195:case1e7104
Code search tool called with query: Check why nexus configuration might not apply
Result: ❌ Failed
Running Eval: evalseta62195:case940152
Code search tool called with query: Check why nexus configuration might not apply
Result: ❌ Failed
*********************************************************************
Eval Run Summary
evalseta62195:
Tests passed: 0
Tests failed: 2
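For what it's worth, the crash itself is easy to reproduce outside ADK. A minimal sketch (the summary_metrics payload below is hypothetical), assuming the Vertex eval service now returns plain Python floats in summary_metrics rather than numpy scalars, which is what the traceback suggests:

import numpy as np

# .item() exists on numpy scalars but not on built-in Python floats.
print(np.float64(0.9).item())  # works: prints 0.9

summary_metrics = {"rouge_1/mean": 0.9}  # hypothetical payload holding a plain float
summary_metrics["rouge_1/mean"].item()   # AttributeError: 'float' object has no attribute 'item'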
Here’s a summary:
- ✅ Vertex AI API call is being made
- ❌ response_match_score causes an internal ADK error
- ✅ Removing it allows the eval to run without crashing
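In case it's useful, a guard in _get_score along these lines would tolerate both scalar types. This is just a sketch of the idea, not a tested patch against the ADK codebase:

def _get_score(self, eval_result):
    # summary_metrics values may be numpy scalars (which have .item())
    # or plain Python floats, depending on the SDK version in use.
    value = eval_result.summary_metrics[f"{self._metric_name}/mean"]
    return value.item() if hasattr(value, "item") else float(value)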
Could you please confirm whether this is a known issue, or if there's something I need to change on my end?
Thanks again for your help,
Indira
CC: @ankursharmas