
Feature/compaction truncation #286


Draft: janspoerer wants to merge 9 commits into main

Conversation

janspoerer (Contributor)

No description provided.


@evalstate (Owner)

Hi @janspoerer - this one is still in draft; I was about to do some tidying up in this area, but will hold off for the moment to make any potential merge less painful. What do we need to do to undraft this?

@janspoerer (Contributor, Author)

Hi @evalstate,
The truncation itself works, but I would still like to make some changes to the provider classes first. Once those are implemented, the PR can be undrafted.

@janspoerer (Contributor, Author)

Hi @evalstate, I just realized you also want to keep the merge painless. I missed that earlier, sorry!

Please don't worry about conflicts from your other changes; I am happy to resolve any merge conflicts myself.

@storlien

Hey, @janspoerer

I got this error when trying to test your feature:

num_tokens += len(encoding.encode(message.first_text()))
                                  ^^^^^^^^^^^^^^^^^^
AttributeError: 'dict' object has no attribute 'first_text'

Full stack:

[mcp_agent.llm.context_truncation] Model gpt-4.1-mini-2025-04-14 not found. Using cl100k_base tokenizer.
  Finished       | Altinity          / Elapsed Time 00:01:32

Usage Summary (Cumulative)
Agent               Input    Output     Total  Turns  Tools  Context%  Model                    
Altinity_Exe...     2,989        63     3,052      1      0      0.3%  gpt-4.1-mini-2025-04-14  

Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "/Users/carl.edward.storlien/Documents/Altinn/altinity/fastagent/altinity.py", line 83, in <module>
    asyncio.run(main())
    ~~~~~~~~~~~^^^^^^^^
  File "/Users/carl.edward.storlien/.local/share/uv/python/cpython-3.13.5-macos-aarch64-none/lib/python3.13/asyncio/runners.py", line 195, in run
    return runner.run(main)
           ~~~~~~~~~~^^^^^^
  File "/Users/carl.edward.storlien/.local/share/uv/python/cpython-3.13.5-macos-aarch64-none/lib/python3.13/asyncio/runners.py", line 118, in run
    return self._loop.run_until_complete(task)
           ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^
  File "/Users/carl.edward.storlien/.local/share/uv/python/cpython-3.13.5-macos-aarch64-none/lib/python3.13/asyncio/base_events.py", line 725, in run_until_complete
    return future.result()
           ~~~~~~~~~~~~~^^
  File "/Users/carl.edward.storlien/Documents/Altinn/jan-repo/fast-agent/src/mcp_agent/core/direct_decorators.py", line 125, in wrapper
    return await func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/carl.edward.storlien/Documents/Altinn/jan-repo/fast-agent/src/mcp_agent/core/direct_decorators.py", line 125, in wrapper
    return await func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/carl.edward.storlien/Documents/Altinn/jan-repo/fast-agent/src/mcp_agent/core/direct_decorators.py", line 125, in wrapper
    return await func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  [Previous line repeated 1 more time]
  File "/Users/carl.edward.storlien/Documents/Altinn/altinity/fastagent/altinity.py", line 80, in main
    await agent.interactive()
  File "/Users/carl.edward.storlien/Documents/Altinn/jan-repo/fast-agent/src/mcp_agent/core/agent_app.py", line 295, in interactive
    return await prompt.prompt_loop(
           ^^^^^^^^^^^^^^^^^^^^^^^^^
    ...<5 lines>...
    )
    ^
  File "/Users/carl.edward.storlien/Documents/Altinn/jan-repo/fast-agent/src/mcp_agent/core/interactive_prompt.py", line 210, in prompt_loop
    result = await send_func(user_input, agent)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/carl.edward.storlien/Documents/Altinn/jan-repo/fast-agent/src/mcp_agent/core/agent_app.py", line 279, in send_wrapper
    result = await self.send(message, agent_name)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/carl.edward.storlien/Documents/Altinn/jan-repo/fast-agent/src/mcp_agent/core/agent_app.py", line 95, in send
    return await self._agent(agent_name).send(message)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/carl.edward.storlien/Documents/Altinn/jan-repo/fast-agent/src/mcp_agent/agents/base_agent.py", line 226, in send
    response = await self.generate([prompt], None)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/carl.edward.storlien/Documents/Altinn/jan-repo/fast-agent/src/mcp_agent/agents/base_agent.py", line 597, in generate
    return await self._llm.generate(multipart_messages, request_params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/carl.edward.storlien/Documents/Altinn/jan-repo/fast-agent/src/mcp_agent/llm/augmented_llm.py", line 233, in generate
    assistant_response: PromptMessageMultipart = await self._apply_prompt_provider_specific(
                                                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        multipart_messages, request_params
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    )
    ^
  File "/Users/carl.edward.storlien/Documents/Altinn/jan-repo/fast-agent/src/mcp_agent/llm/providers/augmented_llm_openai.py", line 548, in _apply_prompt_provider_specific
    ] = await self._openai_completion(
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    ...<2 lines>...
    )
    ^
  File "/Users/carl.edward.storlien/Documents/Altinn/jan-repo/fast-agent/src/mcp_agent/llm/providers/augmented_llm_openai.py", line 346, in _openai_completion
    if self.context_truncation.needs_truncation(
       ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^
        self.history,
        ^^^^^^^^^^^^^
    ...<2 lines>...
        system_prompt,
        ^^^^^^^^^^^^^^
    ):
    ^
  File "/Users/carl.edward.storlien/Documents/Altinn/jan-repo/fast-agent/src/mcp_agent/llm/context_truncation.py", line 87, in needs_truncation
    current_tokens = self._estimate_tokens(memory.get(), model, system_prompt)
  File "/Users/carl.edward.storlien/Documents/Altinn/jan-repo/fast-agent/src/mcp_agent/llm/context_truncation.py", line 70, in _estimate_tokens
    num_tokens += len(encoding.encode(message.first_text()))
                                      ^^^^^^^^^^^^^^^^^^
AttributeError: 'dict' object has no attribute 'first_text'
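For context, the failure mode points at the message shape: memory.get() is apparently returning provider-format dicts (OpenAI-style chat messages) at this point rather than PromptMessageMultipart objects, so first_text() does not exist on them. Below is a minimal sketch of a tolerant _estimate_tokens, written as free functions rather than methods, assuming the dicts follow the OpenAI {"role": ..., "content": ...} shape; the _text_of helper is hypothetical and introduced only for illustration:

    import tiktoken

    def _text_of(message) -> str:
        # Hypothetical helper: accept both PromptMessageMultipart objects
        # and provider-format dicts ({"role": ..., "content": ...}).
        if isinstance(message, dict):
            content = message.get("content", "")
            if isinstance(content, str):
                return content
            # OpenAI-style content parts: [{"type": "text", "text": ...}, ...]
            return " ".join(
                part.get("text", "") for part in content if isinstance(part, dict)
            )
        return message.first_text()

    def _estimate_tokens(messages, model: str, system_prompt: str | None = None) -> int:
        try:
            encoding = tiktoken.encoding_for_model(model)
        except KeyError:
            # Matches the log line above: unknown model names fall back
            # to the cl100k_base tokenizer.
            encoding = tiktoken.get_encoding("cl100k_base")
        num_tokens = len(encoding.encode(system_prompt)) if system_prompt else 0
        for message in messages:
            num_tokens += len(encoding.encode(_text_of(message)))
        return num_tokens

Normalizing the history to multipart messages before it reaches needs_truncation would be the cleaner long-term fix, but a shape check like this keeps the token estimate from crashing on either representation.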
