Hints retrieval in tool use agent #277
base: main
Conversation
Review by Korbit AI
Korbit automatically attempts to detect when you fix issues in new commits.
| Category | Issue | Status |
| --- | --- | --- |
| | Missing Embedding Cache | |
| | Incomplete Error Logging | |
| | Incorrect TaskHint.choose_hints() docstring | |
| | Missing Goal Validation in Embedding-based Hint Selection | |
Files scanned
| File Path | Reviewed |
| --- | --- |
| src/agentlab/llm/tracking.py | ✅ |
| src/agentlab/agents/tool_use_agent/tool_use_agent.py | ✅ |
| src/agentlab/analyze/agent_xray.py | ✅ |
try:
    hint_topic_idx = json.loads(response.think)
    if hint_topic_idx < 0 or hint_topic_idx >= len(hint_topics):
        logger.error(f"Wrong LLM hint id response: {response.think}, no hints")
        return []
    hint_topic = hint_topics[hint_topic_idx]
    hint_indices = topic_to_hints[hint_topic]
    df = self.hint_db.iloc[hint_indices].copy()
    df = df.drop_duplicates(subset=["hint"], keep="first")  # leave only unique hints
    hints = df["hint"].tolist()
    logger.debug(f"LLM hint topic {hint_topic_idx}, chosen hints: {df['hint'].tolist()}")
except json.JSONDecodeError:
    logger.error(f"Failed to parse LLM hint id response: {response.think}, no hints")
    hints = []
Incomplete Error Logging 
What is the issue?
The error handling in choose_hints_llm() loses potentially useful error context by only logging the error message without including the original exception details.
Why this matters
Without the original exception details in the logs, debugging production issues will be more difficult as developers won't have access to the full stack trace and error context.
Suggested change
Include the exception details in the error logging using exc_info=True:
try:
    hint_topic_idx = json.loads(response.think)
    if hint_topic_idx < 0 or hint_topic_idx >= len(hint_topics):
        logger.error(f"Wrong LLM hint id response: {response.think}, no hints")
        return []
    hint_topic = hint_topics[hint_topic_idx]
    hint_indices = topic_to_hints[hint_topic]
    df = self.hint_db.iloc[hint_indices].copy()
    df = df.drop_duplicates(subset=["hint"], keep="first")  # leave only unique hints
    hints = df["hint"].tolist()
    logger.debug(f"LLM hint topic {hint_topic_idx}, chosen hints: {df['hint'].tolist()}")
except json.JSONDecodeError:
    logger.error(f"Failed to parse LLM hint id response: {response.think}, no hints", exc_info=True)
    hints = []
def choose_hints(self, llm, task_name: str, goal: str) -> list[str]:
    """Choose hints based on the task name."""
Incorrect TaskHint.choose_hints() docstring 
What is the issue?
The docstring is inaccurate as the method chooses hints based on task_name OR goal depending on hint_retrieval_mode, not just task_name.
Why this matters
Misleading docstring could cause confusion about the method's behavior when using different hint retrieval modes.
Suggested change
def choose_hints(self, llm, task_name: str, goal: str) -> list[str]:
    """Choose hints based on hint_retrieval_mode using the task name or goal text."""
def encode_hints(self):
    self.uniq_hints = self.hint_db.drop_duplicates(subset=["hint"], keep="first")
    logger.info(
        f"Encoding {len(self.uniq_hints)} unique hints using {self.embedder_model} model."
    )
    self.hint_embeddings = self.emb_model.encode(
        self.uniq_hints["hint"].tolist(), prompt="task hint"
    )
Missing Embedding Cache 
What is the issue?
The hint embeddings are computed on every agent initialization without caching, which is inefficient for repeated runs.
Why this matters
Computing embeddings is computationally expensive and unnecessary to repeat for static hint data. This causes increased startup time and resource usage.
Suggested change
Implement embedding caching to disk:
def encode_hints(self):
    # Requires `from pathlib import Path` and `import numpy as np`.
    # Deduplicate first so uniq_hints is populated even when the cache is hit.
    self.uniq_hints = self.hint_db.drop_duplicates(subset=["hint"], keep="first")
    cache_path = Path(self.hint_db_rel_path).parent / "hint_embeddings.npy"
    if cache_path.exists():
        self.hint_embeddings = np.load(cache_path)
        return
    self.hint_embeddings = self.emb_model.encode(
        self.uniq_hints["hint"].tolist(), prompt="task hint"
    )
    np.save(cache_path, self.hint_embeddings)
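A possible refinement, not part of the suggestion above: key the cache file on a hash of the hint texts so that a modified hint database does not silently reuse stale embeddings. A minimal sketch, assuming hashlib, pathlib, and numpy are imported and using the same attribute names as above:
def encode_hints(self):
    self.uniq_hints = self.hint_db.drop_duplicates(subset=["hint"], keep="first")
    hints = self.uniq_hints["hint"].tolist()
    # Hash the hint texts so a changed hint_db maps to a different cache file.
    digest = hashlib.sha256("\n".join(hints).encode("utf-8")).hexdigest()[:16]
    cache_path = Path(self.hint_db_rel_path).parent / f"hint_embeddings_{digest}.npy"
    if cache_path.exists():
        self.hint_embeddings = np.load(cache_path)
        return
    self.hint_embeddings = self.emb_model.encode(hints, prompt="task hint")
    np.save(cache_path, self.hint_embeddings)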
def choose_hints_emb(self, goal: str) -> list[str]:
    """Choose hints using embeddings to filter the hints."""
    goal_embeddings = self.emb_model.encode([goal], prompt="task description")
    similarities = self.emb_model.similarity(goal_embeddings, self.hint_embeddings)
Missing Goal Validation in Embedding-based Hint Selection 
What is the issue?
The embedding-based hint selection doesn't handle empty or invalid goals, which could cause the embedding model to fail.
Why this matters
If the goal is empty or contains invalid characters, the embedding model may crash or return unexpected results, breaking the hint retrieval functionality.
Suggested change
Add input validation before embedding processing:
def choose_hints_emb(self, goal: str) -> list[str]:
    """Choose hints using embeddings to filter the hints."""
    if not goal or not goal.strip():
        logger.warning("Empty goal provided for embedding-based hint selection")
        return []
    # Clean and validate goal text
    clean_goal = " ".join(goal.split())  # remove extra whitespace
    if len(clean_goal) < 3:  # arbitrary minimum length
        logger.warning(f"Goal text too short for meaningful embedding: {clean_goal}")
        return []
    goal_embeddings = self.emb_model.encode([clean_goal], prompt="task description")
    similarities = self.emb_model.similarity(goal_embeddings, self.hint_embeddings)
Description by Korbit AI
What change is being made?
Enhance the tool use agent with functionality for multiple hint retrieval modes (direct, LLM, and embeddings), including new model configurations, logging, and imports.
Why are these changes being made?
The changes introduce a robust framework for selecting task hints based on contextual relevance, leveraging direct matching, language models, or embeddings. These options provide flexibility in hint selection, optimizing agent performance based on available computational resources and task complexity. Additionally, some minor bug fixes and improvements in logging and configuration management have been made to increase the stability and readability of the codebase.
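For context, the three retrieval modes described above could be dispatched from a single entry point. The sketch below is an assumption based only on the function names visible in the diff (choose_hints_llm, choose_hints_emb), plus a hypothetical hint_retrieval_mode attribute and task_name column; the PR's actual dispatch logic may differ:
def choose_hints(self, llm, task_name: str, goal: str) -> list[str]:
    """Choose hints based on hint_retrieval_mode using the task name or goal text."""
    if self.hint_retrieval_mode == "direct":
        # Direct mode: look up hints recorded for this exact task name (column name assumed).
        df = self.hint_db[self.hint_db["task_name"] == task_name]
        return df["hint"].drop_duplicates().tolist()
    if self.hint_retrieval_mode == "llm":
        # LLM mode: ask the model to choose the most relevant hint topic.
        return self.choose_hints_llm(llm, task_name, goal)
    if self.hint_retrieval_mode == "emb":
        # Embedding mode: rank hints by similarity between the goal and hint embeddings.
        return self.choose_hints_emb(goal)
    logger.warning(f"Unknown hint_retrieval_mode: {self.hint_retrieval_mode}, returning no hints")
    return []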