Fails when testing Haiku on AWS Bedrock #247

@NidPlays

Description

When testing Haiku on AWS Bedrock with higher concurrency values, the run fails with the following error.

Error Details

[2024-11-28 14:55:36,221] p11314 {clientwrap.py:98} WARNING - ERROR:fmbench.scripts.stream_responses:Error occurred while generating and computing metrics associated with the streaming response: litellm.APIConnectionError: Bad response code, expected 200: {'status_code': 400, 'headers': {':exception-type': 'serviceUnavailableException', ':content-type': 'application/json', ':message-type': 'exception'}, 'body': b'{"message":"Bedrock is unable to process your request."}'}

Traceback (most recent call last):
File "/home/ec2-user/anaconda3/envs/fmbench_python311/lib/python3.11/site-packages/litellm/utils.py", line 7919, in __next__
chunk = next(self.completion_stream)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/ec2-user/anaconda3/envs/fmbench_python311/lib/python3.11/site-packages/litellm/llms/bedrock/chat/invoke_handler.py", line 1233, in iter_bytes
message = self._parse_message_from_event(event)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/ec2-user/anaconda3/envs/fmbench_python311/lib/python3.11/site-packages/litellm/llms/bedrock/chat/invoke_handler.py", line 1259, in _parse_message_from_event
raise ValueError(f"Bad response code, expected 200: {response_dict}")
ValueError: Bad response code, expected 200: {'status_code': 400, 'headers': {':exception-type': 'serviceUnavailableException', ':content-type': 'application/json', ':message-type': 'exception'}, 'body': b'{"message":"Bedrock is unable to process your request."}'}

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/home/ec2-user/anaconda3/envs/fmbench_python311/lib/python3.11/site-packages/fmbench/scripts/stream_responses.py", line 100, in get_response_stream
for event in event_iterator:
File "/home/ec2-user/anaconda3/envs/fmbench_python311/lib/python3.11/site-packages/litellm/utils.py", line 8012, in __next__
raise exception_type(
^^^^^^^^^^^^^^^
File "/home/ec2-user/anaconda3/envs/fmbench_python311/lib/python3.11/site-packages/litellm/litellm_core_utils/exception_mapping_utils.py", line 2116, in exception_type
raise e
File "/home/ec2-user/anaconda3/envs/fmbench_python311/lib/python3.11/site-packages/litellm/litellm_core_utils/exception_mapping_utils.py", line 2092, in exception_type
raise APIConnectionError(
litellm.exceptions.APIConnectionError: litellm.APIConnectionError: Bad response code, expected 200: {'status_code': 400, 'headers': {':exception-type': 'serviceUnavailableException', ':content-type': 'application/json', ':message-type': 'exception'}, 'body': b'{"message":"Bedrock is unable to process your request."}'}
Traceback (most recent call last):
File "/home/ec2-user/anaconda3/envs/fmbench_python311/lib/python3.11/site-packages/litellm/utils.py", line 7919, in __next__
chunk = next(self.completion_stream)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/ec2-user/anaconda3/envs/fmbench_python311/lib/python3.11/site-packages/litellm/llms/bedrock/chat/invoke_handler.py", line 1233, in iter_bytes
message = self._parse_message_from_event(event)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/ec2-user/anaconda3/envs/fmbench_python311/lib/python3.11/site-packages/litellm/llms/bedrock/chat/invoke_handler.py", line 1259, in _parse_message_from_event
raise ValueError(f"Bad response code, expected 200: {response_dict}")
ValueError: Bad response code, expected 200: {'status_code': 400, 'headers': {':exception-type': 'serviceUnavailableException', ':content-type': 'application/json', ':message-type': 'exception'}, 'body': b'{"message":"Bedrock is unable to process your request."}'}

[2024-11-28 14:55:36,222] p11314 {clientwrap.py:98} WARNING - INFO:fmbench.scripts.stream_responses:Final result: None

[2024-11-28 14:55:36,225] p11314 {clientwrap.py:98} WARNING - ERROR:custom_bedrock_predictor:Unexpected error during prediction, endpoint_name=eu.anthropic.claude-3-haiku-20240307-v1:0, exception='NoneType' object has no attribute 'get'

[2024-11-28 14:55:36,552] p11314 {clientwrap.py:91} INFO -
Give Feedback / Get Help: https://github.com/BerriAI/litellm/issues/new
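The 400 serviceUnavailableException only appears at higher concurrency values, which suggests the account's Bedrock throughput quota is being exceeded. A possible workaround is to cap the number of in-flight requests on the client side. A minimal sketch, where `send_request`, `MAX_IN_FLIGHT`, and `max_workers=16` are all hypothetical stand-ins for the actual Bedrock/litellm call and quota, not fmbench internals:

```python
import threading
from concurrent.futures import ThreadPoolExecutor

MAX_IN_FLIGHT = 4  # assumption: tune to your account's Bedrock quota
_slots = threading.Semaphore(MAX_IN_FLIGHT)

def throttled(send_request, payload):
    """Run one request, holding a semaphore slot for its duration."""
    with _slots:  # blocks until fewer than MAX_IN_FLIGHT requests are active
        return send_request(payload)

def run_all(send_request, payloads):
    """Fan out many requests while never exceeding MAX_IN_FLIGHT at once."""
    with ThreadPoolExecutor(max_workers=16) as pool:
        return list(pool.map(lambda p: throttled(send_request, p), payloads))
```

The pool can stay wide; the semaphore is what bounds concurrent calls to Bedrock, so the benchmark's requested concurrency and the provider-safe concurrency are decoupled.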

Also, is it possible to fail the request immediately when the rate limit is reached, instead of retrying?
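On the fail-instead-of-retry question: litellm exposes a `num_retries` parameter that can be set to 0 to disable its internal retries (assumption: the installed litellm version supports it). A minimal sketch of a fail-fast wrapper, where `completion_fn` is a stand-in for `litellm.completion`:

```python
def call_without_retry(completion_fn, **kwargs):
    """Invoke the completion exactly once; surface failures immediately.

    Passes num_retries=0 so litellm does not retry on its own (assumption:
    your litellm version accepts this per-call parameter), and re-raises
    any error so the caller records the failure instead of waiting out
    a retry/backoff loop.
    """
    try:
        return completion_fn(num_retries=0, **kwargs)
    except Exception as exc:
        raise RuntimeError(f"request failed without retry: {exc}") from exc
```

Whether fmbench itself can be configured this way is a separate question; this only shows how the underlying litellm call could be made to fail fast.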
