# Using Ragas with Elasticsearch
Ragas is an evaluation framework that gives you a deeper understanding of how your LLM application performs. Evaluation methods such as Ragas help you determine whether your application behaves as intended, give a sense of its accuracy, and support data-driven decisions about model selection, prompt engineering effectiveness, and retrieval system optimization.

This repository contains a demo using a sample book dataset and Elasticsearch.

## Setting up
- This demo was built with Python 3.12.1, but any Python version above 3.10 will work.
- This demo uses Elasticsearch 9.0.3, but any Elasticsearch version above 8.10 will work.
- You will also need an OpenAI API key for the LLM-based metrics. Configure it as an environment variable; you can create a key on the API keys page in [OpenAI's developer portal](https://platform.openai.com/api-keys).
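
The notebook can then read the key from the environment instead of hard-coding it. A minimal sketch — the variable name `OPENAI_API_KEY` is an assumption, so match whatever name you configured:

```python
import os

# Read the OpenAI API key from the environment so it never lands in the notebook.
# The variable name OPENAI_API_KEY is an assumption; adjust it to your setup.
API_KEY = os.getenv("OPENAI_API_KEY", "")
if not API_KEY:
    print("Warning: OPENAI_API_KEY is not set; LLM-based metrics will fail.")
```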
## Structure of the example
- **[books.json](books.json)**: The dataset used, containing a sample of books. This is a subset of 25 books from Goodreads, including the book title, the author's name, a book description, the publication year, and a Goodreads URL.
- **[ragas-elasticsearch-demo.ipynb](ragas-elasticsearch-demo.ipynb)**: Main Jupyter notebook for running Ragas evaluations. It sets up the environment, loads data, runs sample queries, and computes evaluation metrics (context precision, faithfulness, answer relevancy) using Ragas.
- **[ragas_evaluation.csv](ragas_evaluation.csv)**: Output file generated by the notebook, containing detailed results for each evaluation query, including the context precision, faithfulness, and answer relevancy metrics.
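
As a quick sanity check, you can load the dataset directly. A minimal sketch, assuming `books.json` is a JSON array of objects with the fields listed above (inspect the file for the exact field names):

```python
import json

def load_books(path="books.json"):
    """Load the sample book dataset.

    Assumes the file is a JSON array of objects with fields such as the
    book title, author name, description, publication year, and Goodreads
    URL -- exact field names may differ, so inspect the file first.
    """
    with open(path, encoding="utf-8") as f:
        books = json.load(f)
    print(f"Loaded {len(books)} books")
    return books
```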
## Changing the model
This example uses `GPT-4o`. You can switch to another model by adjusting the `model="model name"` parameter:

```python
chat_llm = ChatOpenAI(
    model="gpt-4o",
    temperature=0.1,
    api_key=API_KEY
)
```

## Usage

1. **Install Dependencies**
   The notebook will install required dependencies automatically, but you can also install them manually:

   ```bash
   pip install -q ragas datasets langchain elasticsearch openai langchain-openai
   ```

2. **Run the Notebook**
   Open `ragas-elasticsearch-demo.ipynb` in Jupyter and follow the instructions to run each cell. The notebook will:

   - Query your book index (via Elasticsearch)
   - Run sample RAG queries
   - Evaluate the responses using Ragas
   - Output results to `ragas_evaluation.csv`

3. **View Results**

   The results file contains detailed metrics for each test query. Use it to analyze the quality of your RAG pipeline and compare different configurations.
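
   To get a quick overview, you can average each metric across queries. A minimal sketch — the column names `context_precision`, `faithfulness`, and `answer_relevancy` are assumptions, so check the header of your CSV:

   ```python
   import csv

   def summarize_metrics(path="ragas_evaluation.csv",
                         metrics=("context_precision", "faithfulness", "answer_relevancy")):
       """Average each metric column in the Ragas output CSV.

       The column names above are assumptions based on the metrics used
       in the notebook; check the CSV header for the exact names.
       """
       with open(path, newline="", encoding="utf-8") as f:
           rows = list(csv.DictReader(f))
       return {
           m: sum(float(r[m]) for r in rows) / len(rows)
           for m in metrics if rows and m in rows[0]
       }
   ```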

## Troubleshooting
- Your model needs to be deployed before the code can run. To learn more, be sure to check out our [documentation on the subject](https://www.elastic.co/docs/explore-analyze/machine-learning/nlp/ml-nlp-deploy-model).
- If you encounter problems running this in Colab or locally, it might be because the dataset requires a separate download. It can be found in the same folder as this example.
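
If queries fail, first confirm Elasticsearch is reachable. A minimal sketch, assuming an unsecured local instance on the default port — a cluster with security enabled uses HTTPS and will answer 401 here rather than 200:

```python
import urllib.request

def elasticsearch_is_reachable(url="http://localhost:9200"):
    """Return True if Elasticsearch answers with HTTP 200 at the given URL.

    Assumes an unsecured local instance; a secured cluster rejects this
    unauthenticated plain-HTTP request, so the function returns False.
    """
    try:
        with urllib.request.urlopen(url, timeout=5) as resp:
            return resp.status == 200
    except OSError:
        return False
```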