@aditya1503
Contributor

@aditya1503 aditya1503 commented Oct 16, 2025

This PR introduces a mode option for TLM RAG evals (binary or continuous).
Merge after the backend PR.

query_identifier: Optional[str] = None,
context_identifier: Optional[str] = None,
response_identifier: Optional[str] = None,
mode: Optional[str] = "numeric",
Contributor Author

Continuous

@aditya1503 aditya1503 requested a review from elisno November 4, 2025 14:33
@jwmueller
Member

missing unit tests

@aditya1503
Contributor Author

missing unit tests

Added here

@elisno
Member

elisno commented Dec 4, 2025

What's going on with the formatting check in the CI?

Member

@elisno elisno left a comment

Here's some initial feedback.

Comment on lines -10 to -11
trustworthy_rag, # noqa: F401
trustworthy_rag_api_key, # noqa: F401
Member

Why did you remove these? These fixtures are not defined in conftest.py, so they need to be imported.

Member

Why is this file being updated at all?

Contributor Author

This came from hatch's automatic format fix. I'll look into these.

# Compile and validate the eval
self.mode = self._compile_mode(mode, criteria, name)

def _compile_mode(self, mode: Optional[str], criteria: str, name: str) -> str:
Member

This _compile_mode method needs to be written more carefully to avoid unintentionally breaking all our tests. Many of the UserWarnings it emits will be raised as errors during automated testing.
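
To illustrate the concern, here is a minimal sketch using only the standard library. The compile_mode function, its accepted values ("numeric", "binary"), and its fall-back behavior are hypothetical and are not the code in this PR. The sketch only shows that a warning emitted during validation turns into an exception once the warnings filter is set to "error", as a strict test configuration may do.

```python
import warnings
from typing import List, Optional, Tuple

# Hypothetical set of accepted modes; the values used in the PR may differ.
VALID_MODES = ("numeric", "binary")


def compile_mode(mode: Optional[str]) -> str:
    """Hypothetical validator: warn and fall back to "numeric" on an unknown mode."""
    if mode is None:
        return "numeric"
    if mode not in VALID_MODES:
        warnings.warn(
            f"Unknown mode {mode!r}; falling back to 'numeric'.",
            UserWarning,
            stacklevel=2,
        )
        return "numeric"
    return mode


def raises_under_strict_filter(mode: Optional[str]) -> bool:
    """Return True if compile_mode raises when warnings are promoted to errors."""
    with warnings.catch_warnings():
        # Mimic a test run that treats every warning as an error.
        warnings.simplefilter("error")
        try:
            compile_mode(mode)
        except UserWarning:
            return True
    return False


def run_and_record(mode: Optional[str]) -> Tuple[str, List[str]]:
    """Run compile_mode with warnings recorded instead of raised."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        result = compile_mode(mode)
    return result, [str(w.message) for w in caught]
```

Under a lenient filter the unknown mode only produces a recorded warning and the fall-back value. Under the strict filter the same call raises, which is why any code path that warns by default would fail every test that reaches it.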

# Compile and validate the eval
self.mode = self._compile_mode(mode, criteria, name)

def _compile_mode(self, mode: Optional[str], criteria: str, name: str) -> str:
Contributor Author

Add separate test cases for these


5 participants