Add RAG Q&A system evaluation use case example #750
Open
yttrium400 wants to merge 2 commits into uptrain-ai:main
added 2 commits
December 6, 2025 22:54
Closes uptrain-ai#522 - Created a comprehensive notebook showing how to evaluate a RAG-based Q&A system with multiple checks and analysis.
Use JailbreakDetection() class instead of non-existent Evals constant and correct the score field name to score_jailbreak_attempted
Pull Request Template
Type of Change
Please delete options that are not relevant.
Description
This PR adds a comprehensive use case example demonstrating how to evaluate a RAG-based question-answering system using UpTrain.
The notebook provides a complete, end-to-end workflow showing developers how to:
The example uses a realistic scenario: a Python programming Q&A system that retrieves documentation and generates responses. It includes intentionally problematic examples (e.g., irrelevant context, hallucinated responses) to demonstrate how UpTrain's evaluations catch these issues.
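To make the scenario above concrete, here is a minimal sketch of what such a dataset and a follow-up analysis step could look like. It is not taken from the notebook: the sample rows, the `mock_results` scores, and the `flag_low_scores` helper are all hypothetical, and the `score_context_relevance` / `score_factual_accuracy` field names are assumed to follow UpTrain's `score_<check>` naming. In the real workflow the scores would come from UpTrain's evaluation call rather than being written by hand.

```python
from typing import Dict, List

# Hypothetical Python Q&A samples in a question/context/response layout.
samples = [
    {   # well-behaved example
        "question": "How do I reverse a list in Python?",
        "context": "list.reverse() reverses the elements of the list in place.",
        "response": "Call my_list.reverse() to reverse the list in place.",
    },
    {   # intentionally problematic: retrieved context is irrelevant
        "question": "How do I read a file in Python?",
        "context": "The math module provides access to mathematical functions.",
        "response": "Use the built-in open() function.",
    },
    {   # intentionally problematic: response is not supported by the context
        "question": "What does len() return for a string?",
        "context": "len(s) returns the number of items in a container.",
        "response": "len() returns the size of the string in kilobytes.",
    },
]

# Hand-written stand-in scores (assumption: real values would be produced
# by the evaluation run, not typed in like this).
mock_scores = [
    {"score_context_relevance": 1.0, "score_factual_accuracy": 1.0},
    {"score_context_relevance": 0.0, "score_factual_accuracy": 1.0},
    {"score_context_relevance": 1.0, "score_factual_accuracy": 0.0},
]
mock_results = [{**row, **scores} for row, scores in zip(samples, mock_scores)]


def flag_low_scores(results: List[dict], threshold: float = 0.5) -> Dict[int, List[str]]:
    """Map row index -> names of score fields that fall below the threshold."""
    flagged: Dict[int, List[str]] = {}
    for index, row in enumerate(results):
        low = sorted(
            key
            for key, value in row.items()
            if key.startswith("score_")
            and isinstance(value, (int, float))
            and value < threshold
        )
        if low:
            flagged[index] = low
    return flagged


for index, fields in flag_low_scores(mock_results).items():
    print(f"Row {index} ({mock_results[index]['question']}): low {', '.join(fields)}")
```

The helper only treats low values as failures, so it suits quality-style scores; a check where a high score signals a problem (such as a jailbreak score) would need the opposite comparison.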
Fixes #522
Checklist
Author's Note
This is my first contribution to UpTrain! I created this use case example to help newcomers understand how to evaluate RAG systems comprehensively. The notebook:
Files added:
examples/use_cases/rag_qa_system_evaluation.ipynb - Complete RAG evaluation workflow
examples/use_cases/README.md - Documentation for the use_cases directory
Key features:
Note on testing:
The notebook doesn't include automated tests as it's an example/tutorial. It follows the pattern of other notebooks in the examples/ directory, which are meant to be run interactively.
Note on CHANGELOG:
I noticed the repository doesn't have a CHANGELOG file, so I haven't added an entry there.
Looking forward to feedback!