
Hints retrieval in tool use agent #277


Open · wants to merge 2 commits into base: main

Conversation

ollmer
Collaborator

@ollmer ollmer commented Aug 13, 2025

Description by Korbit AI

What change is being made?

Enhance the tool use agent with functionality for multiple hint retrieval modes (direct, LLM, and embeddings), including new model configurations, logging, and imports.

Why are these changes being made?

The changes introduce a robust framework for selecting task hints based on contextual relevance, leveraging direct matching, language models, or embeddings. These options provide flexibility in hint selection, optimizing agent performance based on available computational resources and task complexity. Additionally, some minor bug fixes and improvements in logging and configuration management have been made to increase the stability and readability of the codebase.

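The three retrieval modes described above can be pictured with a small dispatch sketch. All names here (`HintRetrievalMode`, `choose_hints`, the toy `hint_db`) are illustrative assumptions rather than the PR's actual API, and only the direct mode is fleshed out; the LLM and embedding variants appear in the review snippets further down.

```python
from enum import Enum


class HintRetrievalMode(Enum):
    DIRECT = "direct"  # exact match on the task name
    LLM = "llm"        # ask an LLM to pick the most relevant hint topic
    EMB = "emb"        # rank hints by embedding similarity to the goal


def choose_hints(mode: HintRetrievalMode, hint_db: list[tuple[str, str]],
                 task_name: str, goal: str) -> list[str]:
    """Dispatch to the configured retrieval strategy (direct mode only here)."""
    if mode is HintRetrievalMode.DIRECT:
        # Direct mode: return every hint recorded for this exact task name,
        # dropping duplicates while preserving order.
        hints = [hint for name, hint in hint_db if name == task_name]
        return list(dict.fromkeys(hints))
    raise NotImplementedError(f"{mode} is only sketched in the review comments")


hint_db = [
    ("book_flight", "Open the date picker before typing dates"),
    ("book_flight", "Open the date picker before typing dates"),  # duplicate
    ("login", "Prefer the SSO button"),
]
print(choose_hints(HintRetrievalMode.DIRECT, hint_db, "book_flight", ""))
# → ['Open the date picker before typing dates']
```

The enum mirrors the "direct, LLM, and embeddings" options from the description; in the real agent the mode would presumably come from the agent's configuration.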


@korbit-ai korbit-ai bot left a comment


Review by Korbit AI

Korbit automatically attempts to detect when you fix issues in new commits.
Issues by category:
- Performance: Missing Embedding Cache
- Error Handling: Incomplete Error Logging
- Documentation: Incorrect TaskHint.choose_hints() docstring
- Functionality: Missing Goal Validation in Embedding-based Hint Selection
Files scanned:
- src/agentlab/llm/tracking.py
- src/agentlab/agents/tool_use_agent/tool_use_agent.py
- src/agentlab/analyze/agent_xray.py


Comment on lines +378 to +391
try:
    hint_topic_idx = json.loads(response.think)
    if hint_topic_idx < 0 or hint_topic_idx >= len(hint_topics):
        logger.error(f"Wrong LLM hint id response: {response.think}, no hints")
        return []
    hint_topic = hint_topics[hint_topic_idx]
    hint_indices = topic_to_hints[hint_topic]
    df = self.hint_db.iloc[hint_indices].copy()
    df = df.drop_duplicates(subset=["hint"], keep="first")  # leave only unique hints
    hints = df["hint"].tolist()
    logger.debug(f"LLM hint topic {hint_topic_idx}, chosen hints: {df['hint'].tolist()}")
except json.JSONDecodeError:
    logger.error(f"Failed to parse LLM hint id response: {response.think}, no hints")
    hints = []

Incomplete Error Logging category Error Handling

What is the issue?

The error handling in choose_hints_llm() loses potentially useful error context by only logging the error message without including the original exception details.

Why this matters

Without the original exception details in the logs, debugging production issues will be more difficult as developers won't have access to the full stack trace and error context.

Suggested change

Include the exception details in the error logging using exc_info=True:

try:
    hint_topic_idx = json.loads(response.think)
    if hint_topic_idx < 0 or hint_topic_idx >= len(hint_topics):
        logger.error(f"Wrong LLM hint id response: {response.think}, no hints")
        return []
    hint_topic = hint_topics[hint_topic_idx]
    hint_indices = topic_to_hints[hint_topic]
    df = self.hint_db.iloc[hint_indices].copy()
    df = df.drop_duplicates(subset=["hint"], keep="first")  # leave only unique hints
    hints = df["hint"].tolist()
    logger.debug(f"LLM hint topic {hint_topic_idx}, chosen hints: {df['hint'].tolist()}")
except json.JSONDecodeError:
    logger.error(f"Failed to parse LLM hint id response: {response.think}, no hints", exc_info=True)
    hints = []

Comment on lines +358 to +359
def choose_hints(self, llm, task_name: str, goal: str) -> list[str]:
"""Choose hints based on the task name."""

Incorrect TaskHint.choose_hints() docstring category Documentation

What is the issue?

The docstring is inaccurate as the method chooses hints based on task_name OR goal depending on hint_retrieval_mode, not just task_name.

Why this matters

Misleading docstring could cause confusion about the method's behavior when using different hint retrieval modes.

Suggested change
def choose_hints(self, llm, task_name: str, goal: str) -> list[str]:
    """Choose hints based on hint_retrieval_mode using task name or goal text."""

Comment on lines +327 to +334
def encode_hints(self):
    self.uniq_hints = self.hint_db.drop_duplicates(subset=["hint"], keep="first")
    logger.info(
        f"Encoding {len(self.uniq_hints)} unique hints using {self.embedder_model} model."
    )
    self.hint_embeddings = self.emb_model.encode(
        self.uniq_hints["hint"].tolist(), prompt="task hint"
    )

Missing Embedding Cache category Performance

What is the issue?

The hint embeddings are computed on every agent initialization without caching, which is inefficient for repeated runs.

Why this matters

Computing embeddings is computationally expensive and unnecessary to repeat for static hint data. This causes increased startup time and resource usage.

Suggested change

Implement embedding caching to disk:

def encode_hints(self):
    # Deduplicate first so self.uniq_hints is populated on the cache-hit path
    # too; choose_hints_emb relies on it to map scores back to hint texts.
    self.uniq_hints = self.hint_db.drop_duplicates(subset=["hint"], keep="first")

    cache_path = Path(self.hint_db_rel_path).parent / "hint_embeddings.npy"
    if cache_path.exists():
        self.hint_embeddings = np.load(cache_path)
        return

    self.hint_embeddings = self.emb_model.encode(
        self.uniq_hints["hint"].tolist(), prompt="task hint"
    )
    np.save(cache_path, self.hint_embeddings)
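One gap in the caching suggestion above is invalidation: a fixed `hint_embeddings.npy` path goes stale if `hint_db` is edited. A sketch of a content-keyed cache, where the file name is derived from a hash of the hint texts, avoids this; the function names are illustrative, and `pickle` stands in for `np.save`/`np.load` only to keep the sketch stdlib-only.

```python
import hashlib
import pickle
from pathlib import Path


def cache_path_for(hints: list[str], cache_dir: Path) -> Path:
    """Cache file name derived from the hint texts, so edits change the key."""
    digest = hashlib.sha256("\n".join(hints).encode("utf-8")).hexdigest()[:16]
    return cache_dir / f"hint_embeddings_{digest}.pkl"


def load_or_encode(hints: list[str], encode, cache_dir: Path):
    """Return cached embeddings if the hints are unchanged, else encode and cache."""
    path = cache_path_for(hints, cache_dir)
    if path.exists():
        return pickle.loads(path.read_bytes())
    embeddings = encode(hints)
    path.write_bytes(pickle.dumps(embeddings))
    return embeddings
```

In the real code `encode` would be `self.emb_model.encode` and the payload an `np.ndarray`; the point is only that the cache key should depend on the hint contents, not on a fixed filename.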

Comment on lines +394 to +397
def choose_hints_emb(self, goal: str) -> list[str]:
    """Choose hints using embeddings to filter the hints."""
    goal_embeddings = self.emb_model.encode([goal], prompt="task description")
    similarities = self.emb_model.similarity(goal_embeddings, self.hint_embeddings)

Missing Goal Validation in Embedding-based Hint Selection category Functionality

What is the issue?

The embedding-based hint selection doesn't handle empty or invalid goals, which could cause the embedding model to fail.

Why this matters

If the goal is empty or contains invalid characters, the embedding model may crash or return unexpected results, breaking the hint retrieval functionality.

Suggested change

Add input validation before embedding processing:

def choose_hints_emb(self, goal: str) -> list[str]:
    """Choose hints using embeddings to filter the hints."""
    if not goal or not goal.strip():
        logger.warning("Empty goal provided for embedding-based hint selection")
        return []
    
    # Clean and validate goal text
    clean_goal = ' '.join(goal.split())  # Remove extra whitespace
    if len(clean_goal) < 3:  # Arbitrary minimum length
        logger.warning(f"Goal text too short for meaningful embedding: {clean_goal}")
        return []
    
    goal_embeddings = self.emb_model.encode([clean_goal], prompt="task description")
    similarities = self.emb_model.similarity(goal_embeddings, self.hint_embeddings)
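The quoted snippet cuts off after the similarity computation. A plausible continuation (an assumption, not taken from the PR — the `top_k` value and the 1-by-n score layout are hypothetical) would rank the unique hints by score and keep the best few:

```python
import numpy as np


def top_k_hints(similarities: np.ndarray, hints: list[str], top_k: int = 5) -> list[str]:
    """Return the top_k hints with the highest similarity to the goal."""
    scores = np.asarray(similarities).reshape(-1)  # flatten 1 x n -> n
    order = np.argsort(scores)[::-1][:top_k]       # best-scoring indices first
    return [hints[i] for i in order]


sims = np.array([[0.2, 0.9, 0.5]])
print(top_k_hints(sims, ["h0", "h1", "h2"], top_k=2))
# → ['h1', 'h2']
```

A fixed `top_k` is the simplest policy; a similarity threshold would be an alternative when the hint database mixes many unrelated tasks.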
